=head1 IronMan-Web release issues.

Remaining issues for live deployment on 14th of April 2010.

=head2 Page banner. - DONE - IDN

The following banner is missing from the top of the page:

    Are you flesh? Or are you Iron?
    Take the challenge
    http://ironman.enlightenedperl.org/signup/new_feed

=head2 Paging.

The detail between the banner detailed above and the very first feed post:

    Join the program | Learn about the program | Report a problem

    Only showing posts tagged "perl", "cpan" or "ironman" (or containing those words).

    Last updated: 21:25:27 13-Apr-2010

    First Previous 1 2 3 4 5 Next Last

The above is missing at the top of the page, though the Older and Newer posts links are shown at the bottom of the page.

=head2 Captcha on signup form.

Castaway has added a recaptcha to the signup form on the existing site. This needs backporting to the dev site.

=head2 Branding.

These are minor inconsistencies with the existing UI. Whilst we need to fix these up, we also need provision to have the all.things.per.ly branding restored in the future. Please bear this in mind when making the changes so that this can be achieved simply.

Page title should be "Planet Perl Iron Man"

The epo banner at the top should read:

    enlightened perl organisation
    enlightened |en'litnd|: adjective: having or showing a rational, modern, and well-informed outlook

Header in the yellow bar at the top of the page should similarly be:

    Planet Perl Iron Man

Footer should be:

    Perl Iron Man Planet

or:

    Planet Perl Iron Man

The second would be more consistent, though the former is the existing text.

=head2 Images.

For some reason images are being removed from posts. No idea why this might be, but it's a degradation of existing functionality if we can't figure it out.

=head2 Article headers.

Tags and such are missing from the article headers. I'm sure this has been fixed in the data move once, but I don't recall how :(

=head1 IronMan-Web feature requests.

=head2 OPML file link.

The existing site shows a link to download the OPML file. We need something here to download a similar SQLite file with the email addresses removed.

=head2 Spam handling.

The existing site has many spam handling issues. There's lots of crap in the database. We need:

=over

=item *

Tools to remove the existing crap from the database.

=item *

Support built into Perlanet-IronMan to try to limit the amount that then reappears. A number of people have suggested that using SpamAssassin might be a starting point for scanning existing content and possibly new content.

=item *

We need a quick and simple way to remove spam feeds once they're identified. This should probably be by feed or post URL.

From IRC discussion:

    12:55 < mst> for the live one it's easy
    12:55 < mst> just cp the sqlite db first
    12:55 < castaway> on ironman I always do: login, cd plagger, cp subscriptions.db scubscriptions_pewdespam.db; sqlite3 subscriptions.db
    12:56 < castaway> any mess, re copy and start again ;)
    12:56 < castaway> backups++

=item *

A name should not appear on the index page until at least one post has been made.

=back

Note from robinsmidsrod in #epo-ironman:

    12:20 < robinsmidsrod> idn: I just wanted to suggest to you to use the http_bl support from https://www.projecthoneypot.org to reduce spam entering the ironman database - I've successfully used it on my blog - now I barely have spam entering my blog, and I don't have a captcha installed
    12:21 < robinsmidsrod> I used the mod_httpbl apache implementation from https://www.projecthoneypot.org/httpbl_implementations.php
    12:21 < idn> robinsmidsrod: Thanks for the suggestion, I'll stick it in the todo list for investigation.
    12:23 < robinsmidsrod> sorry, my bad, I didn't use the mod_httpbl module - I actually used a b2evolution plugin - but projecthoneypot has a simple DNS-based API, so it shouldn't be to hard to make a perl module to handle it
    12:23 < robinsmidsrod> it works just as any other DNS-based blacklist
    12:24 < robinsmidsrod> but the cool thing is that you can actually choose the threshold level for when you will block users

Discussion followed:

    12:24 < idn> That's an interesting idea that I hadn't thought of.
    12:25 < idn> How would you look to implement it, at collection time or at signup time?
    12:25 < robinsmidsrod> actually, if you run your own DNS server (and http server) I would suggest to support the project - it is an awesome project (I've been a member for a bit over a year)
    12:25 < robinsmidsrod> the new site is dynamic, right?
    12:25 < idn> Yes
    12:26 < idn> Hmm, I'd love to support the project, but my employer isn't very community minded or responsible in that respect.
    12:26 < robinsmidsrod> so just look up REMOTE_IP, do a lookup against projecthoneypot BL (via DNS) and check the response - if it shows something that looks bad, just block or redirect the user to a page that explains the problem, or enable captcha
    12:26 < robinsmidsrod> idn: well, I support it with my private stuff
    12:27 < robinsmidsrod> anyone can donate a spare MX pointer ;)
    12:27 < robinsmidsrod> for some hostname you would probably never use
    12:27 < idn> Ah, I'm with you.
    12:28 < robinsmidsrod> mine is XXXmailserver.smidsrod.no which points to a honeypot
    12:29 < idn> So all I need is some domains rather than any actual kit ;) All of mine are on 123reg which neatly solves that problem.
    12:30 < robinsmidsrod> me alone have helped catching approx. 20 harvesters and spammers in the last year
    12:30 < robinsmidsrod> which is nice to know :)
    12:31 < robinsmidsrod> you can support them by donating an MX entry in your DNS, setting up a "hidden" link on your own websites linking to a honeypot, or you can setup an actual honeypot - I've only done the two first
    12:32 < idn> That seems like a good idea, I'll put it forward to the boss too and see if we can't do something here at work.
    12:32 < robinsmidsrod> what I do in my blog is that if the remote_ip looks suspicies I redirect the user to my honeypot page, which means that those IPs that are already somewhat fishy will be redirected to something that will make them more fishy if they harvest it :)
    12:33 < idn> I had been contemplating using spam assassin to scan content too
    12:34 < robinsmidsrod> idn: this is my honeypot page: http://minmailserver.smidsrod.no/
    12:34 < robinsmidsrod> if you look at the HTML content you'll see that there are some hidden links that harvesters will catch
    12:35 < idn> There are a couple of problems to address, one is the existing bad feeds (most of which don't appear because they don't use the right keywords) and secondly preventing new bad feeds.
    12:36 < idn> The former is more of a problem due to the way in which the list of signed up users appears on the front page.
    12:36 < robinsmidsrod> honeypot will only help with the new bad feeds
    12:36 < idn> Possibly, the site hosting the spam might well be listed
    12:37 * mst still thinks "only appears in the right bar if they've got at least one post" would be a start
    12:37 < robinsmidsrod> if you have any IP-adresses linked with existing content you could of course manually run it throuh their BL and see what you find
    12:37 < robinsmidsrod> mst: I agree with that one
    12:37 < mst> actual spam posts will get nailed pretty quickly, I think
    12:37 < idn> mst: Yes, that's what's in the todo I think
    12:38 < robinsmidsrod> a "report spam" feature is available?
    12:38 < castaway> robinsmidsrod: no but that'd be handy, care to write one?
    12:38 < idn> I've thought about that and I've mixed opinions. It seems like it could be open to abuse and or create work for someone to deal with.
    12:38 < castaway> idn: we run a website, its gonna create work ;
    12:38 < castaway> ;)
    12:39 < idn> Yes. But I like to try my best to minimise that ;)
    12:39 < robinsmidsrod> castaway: I don't have any time available, 200% workload with work + full time studies, but I can explain how it could be created to mitigate moderator intervention.
    12:40 < idn> I was contemplating some kind of scoring system that would blacklist a feed once so many reports have been received from different requesting hosts, but it's still a little open to abuse. Coupled with administrative notification and oversight to re-enable if needed.
    12:40 < robinsmidsrod> castaway: create a form (POST) with a "Report spam" button so that behaving robots won't access it. When enough people have clicked that button the article will be blacklisted until a moderator actually clears it from blacklist
    12:40 < castaway> idn: like, say, bayes? ;)
    12:41 < robinsmidsrod> idn: exactly the same as I thought
    12:41 < castaway> robinsmidsrod: makes sense.. (user moderation, yay)
    12:41 < robinsmidsrod> that way either the author needs to complain to a moderator that his post doesn't show up
    12:41 < idn> castaway: Erm, not quite, though my understanding of things statistical could be written on the back of a very small pin head....
    12:41 < robinsmidsrod> because it got blacklisted
    12:42 < idn> That wouldn't work in that each and every time the feed generated a spam, it would need to be black listed. Though I like where you're going.
    12:42 < robinsmidsrod> mst: do you have a suggestion on how to calculate how many reports should cause blacklisting to be triggered?
    12:42 < idn> Blacklist the individual post with some kind of hysteresis, then blacklist the feed once enough posts have been blacklisted.
    12:42 < mst> idn: er, what?
    12:43 < mst> why would you have to regen?
    12:43 < robinsmidsrod> idn: or enable reporting spam on both posts and feeds
    12:43 < mst> oh, each time. yes.
    12:43 < mst> idn: that's simple.
    12:43 < mst> two blacklists and the feed goes.
    12:43 < idn> There we go then :)
    12:44 < robinsmidsrod> I'd suggest to put the feed in a quarantine so that it is easy for a moderator to un-blacklist feeds - and once a feed has been un-blacklisted you would increase the blacklist threshold
    12:45 < robinsmidsrod> sometimes member blogs get hijacked and start generating spam, but I guess that problem is much smaller than spammer blogs in general
    12:46 < idn> I wouldn't remove the feed, they could just sign it up again. I'd opt for blacklisting it and never collecting it again
    12:47 < robinsmidsrod> if the feed has been in the blacklist for , let's say a month or two, it will be automatically purged from the database
    12:47 < robinsmidsrod> what you said makes more sense yes :)
    12:47 < robinsmidsrod> and if they try to signup again you redirect them to a projecthoneypot page :)
    12:49 < robinsmidsrod> just make sure the report spam button is a form/post button, not a link, or else you'll gather report spam-reports en masse when misbehaving robots come in
    12:51 < idn> Hmm, that's a good place to use the honeypot stuff too.
    12:52 < robinsmidsrod> catch the spammers/harvesters by their own bad behaviour :)

=head2 Standards compliance.

=head3 Atom feed validation

L

I'm told this involves fixing L.

=head2 Gravatar support?

    12:12 < poisonbit> gravatar support could be funny
    12:13 < idn> poisonbit: Cool. I like. I'll add it to the wishlist.

http://en.gravatar.com/

=cut