<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>underused.org</title>
	<atom:link href="http://underused.org/feed/" rel="self" type="application/rss+xml" />
	<link>http://underused.org</link>
	<description>The notoriously underused weblog by Michael Scharkow</description>
	<pubDate>Thu, 10 Jul 2008 22:31:09 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.2-alpha</generator>
	<language>en</language>
			<item>
		<title>Journal of Articles in Support of the Null Hypothesis</title>
		<link>http://underused.org/2008/07/11/journal-of-articles-in-support-of-the-null-hypothesis/</link>
		<comments>http://underused.org/2008/07/11/journal-of-articles-in-support-of-the-null-hypothesis/#comments</comments>
		<pubDate>Thu, 10 Jul 2008 22:28:06 +0000</pubDate>
		<dc:creator>Michael Scharkow</dc:creator>
		
		<category><![CDATA[Science]]></category>

		<category><![CDATA[statistical significance]]></category>

		<guid isPermaLink="false">http://underused.org/2008/07/11/journal-of-articles-in-support-of-the-null-hypothesis/</guid>
		<description><![CDATA[After my PhD supervisor recently advised me that I had better not end up with non-results in my thesis, JASNH&#160;looks like a very cool plan B for publication.
]]></description>
			<content:encoded><![CDATA[<p>After my PhD supervisor recently advised me that I had better not end up with non-results in my thesis, <a href="http://www.jasnh.com/">JASNH</a>&nbsp;looks like a very cool plan B for publication.</p>
]]></content:encoded>
			<wfw:commentRss>http://underused.org/2008/07/11/journal-of-articles-in-support-of-the-null-hypothesis/feed/</wfw:commentRss>
		</item>
		<item>
		<title>posterous - Email based blogging redux</title>
		<link>http://underused.org/2008/06/29/posterous-email-based-blogging-redux/</link>
		<comments>http://underused.org/2008/06/29/posterous-email-based-blogging-redux/#comments</comments>
		<pubDate>Sun, 29 Jun 2008 15:57:25 +0000</pubDate>
		<dc:creator>Michael Scharkow</dc:creator>
		
		<category><![CDATA[tumblelog]]></category>

		<category><![CDATA[Blogs]]></category>

		<category><![CDATA[Web Apps]]></category>

		<guid isPermaLink="false">http://underused.org/2008/06/29/posterous-email-based-blogging-redux/</guid>
		<description><![CDATA[posterous&#160;lets you create a blog by sending one simple email to post@posterous.com. Bidirectional email commenting works, too. Does it get any simpler?
]]></description>
			<content:encoded><![CDATA[<p><a href="http://posterous.com/">posterous</a>&nbsp;lets you create a blog by sending one simple email to post@posterous.com. Bidirectional email commenting works, too. Does it get any simpler?</p>
]]></content:encoded>
			<wfw:commentRss>http://underused.org/2008/06/29/posterous-email-based-blogging-redux/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Harvard enforces Open Access for all research</title>
		<link>http://underused.org/2008/02/15/harvard-enforces-open-access-for-all-research/</link>
		<comments>http://underused.org/2008/02/15/harvard-enforces-open-access-for-all-research/#comments</comments>
		<pubDate>Fri, 15 Feb 2008 17:49:47 +0000</pubDate>
		<dc:creator>Michael Scharkow</dc:creator>
		
		<category><![CDATA[Science]]></category>

		<category><![CDATA[open access]]></category>

		<guid isPermaLink="false">http://underused.org/2008/02/15/harvard-enforces-open-access-for-all-research/</guid>
		<description><![CDATA[Gary King just announced that the Faculty of Arts and Sciences at Harvard unanimously decided to enforce Open Access for all faculty members. This bold move should seriously advance OA world-wide.

I&#8217;d like to see similar steps forward from the German DFG and the like, or from my university. I guess most of the faculty at UdK [...]]]></description>
			<content:encoded><![CDATA[<p>Gary King <a href="http://www.iq.harvard.edu/blog/sss/archives/2008/02/open_access_to.shtml">just announced</a> that the Faculty of Arts and Sciences at Harvard unanimously decided to enforce Open Access for all faculty members. This bold move should seriously advance OA world-wide.</p>

<p>I&#8217;d like to see similar steps forward from the German DFG and the like, or from my university. I guess most of the faculty at UdK does not even know what OA means, and who needs to if there&#8217;s no research anyway ;-)</p>
]]></content:encoded>
			<wfw:commentRss>http://underused.org/2008/02/15/harvard-enforces-open-access-for-all-research/feed/</wfw:commentRss>
		</item>
		<item>
		<title>SimpleBits realigned</title>
		<link>http://underused.org/2008/02/01/simplebits-realigned/</link>
		<comments>http://underused.org/2008/02/01/simplebits-realigned/#comments</comments>
		<pubDate>Fri, 01 Feb 2008 09:05:20 +0000</pubDate>
		<dc:creator>Michael Scharkow</dc:creator>
		
		<category><![CDATA[tumblelog]]></category>

		<category><![CDATA[Web Design]]></category>

		<guid isPermaLink="false">http://underused.org/2008/02/01/simplebits-realigned/</guid>
		<description><![CDATA[Simplebits gets yet another great mini-redesign.
]]></description>
			<content:encoded><![CDATA[<p><a href="http://simplebits.com/">Simplebits</a> gets yet another great mini-redesign.</p>
]]></content:encoded>
			<wfw:commentRss>http://underused.org/2008/02/01/simplebits-realigned/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Typoscript&#8217;s RECORDS, how I love thee</title>
		<link>http://underused.org/2008/01/06/typoscripts-records-how-i-love-thee/</link>
		<comments>http://underused.org/2008/01/06/typoscripts-records-how-i-love-thee/#comments</comments>
		<pubDate>Sun, 06 Jan 2008 20:21:09 +0000</pubDate>
		<dc:creator>Michael Scharkow</dc:creator>
		
		<category><![CDATA[Hacking]]></category>

		<category><![CDATA[TYPO3]]></category>

		<guid isPermaLink="false">http://underused.org/2008/01/06/typoscripts-records-how-i-love-thee/</guid>
		<description><![CDATA[After hacking it for so many years, there are still some surprising nuggets hidden in TYPO3. Our TYPO3-newbie webmaster Johannes recently pointed me to the RECORDS type in our template which is extremely useful for including portlet-style content elements in your template. In order to include a plugin somewhere in your page template, simply add [...]]]></description>
			<content:encoded><![CDATA[<p>After hacking it for so many years, there are still some surprising nuggets hidden in TYPO3. Our TYPO3-newbie webmaster <a href="http://sitegraph.de">Johannes</a> recently pointed me to the RECORDS type in our template which is extremely useful for including portlet-style content elements in your template. In order to include a plugin somewhere in your page template, simply add it to a sysfolder or hidden page somewhere and refer to the content record like this:</p>

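<pre><code># Render a single content record into the TAGCLOUD subpart
subparts.TAGCLOUD = RECORDS
subparts.TAGCLOUD {
  # table that holds the content element
  tables = tt_content
  # uid of the content element to render
  source = 444
  # skip the check on the record's parent page (sysfolder or hidden page)
  dontCheckPid = 1
}
</code></pre>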

<p>That&#8217;s it. The plugin content is rendered without USER_INT fiddling or COA tricks, and configuration is dummy-proof with Flexforms, which seem to be more popular than TS configuration anyway. You can also include link lists, search forms, user logins or normal content this way, all nicely editable by your average users.</p>
]]></content:encoded>
			<wfw:commentRss>http://underused.org/2008/01/06/typoscripts-records-how-i-love-thee/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Scraping Youtube with Beautiful Soup</title>
		<link>http://underused.org/2007/12/11/scraping-youtube-with-beautiful-soup/</link>
		<comments>http://underused.org/2007/12/11/scraping-youtube-with-beautiful-soup/#comments</comments>
		<pubDate>Tue, 11 Dec 2007 09:58:52 +0000</pubDate>
		<dc:creator>Michael Scharkow</dc:creator>
		
		<category><![CDATA[Hacking]]></category>

		<category><![CDATA[BeautifulSoup]]></category>

		<category><![CDATA[Python]]></category>

		<category><![CDATA[Youtube]]></category>

		<guid isPermaLink="false">http://underused.org/2007/12/11/scraping-youtube-with-beautiful-soup/</guid>
		<description><![CDATA[For an upcoming project I need to track some usage statistics of  Youtube videos which are not provided via the GData API. The common solution to this problem is screen scraping the HTML pages and extracting the information.

Here&#8217;s a quick howto using Python and the  BeautifulSoup HTML/XML parser. 

First off, we choose a [...]]]></description>
			<content:encoded><![CDATA[<p>For an upcoming project I need to track some usage statistics of  Youtube videos which are not provided via the GData API. The common solution to this problem is screen scraping the HTML pages and extracting the information.</p>

<p>Here&#8217;s a quick howto using Python and the  <a href="http://www.crummy.com/software/BeautifulSoup/">BeautifulSoup</a> HTML/XML parser. </p>

<p>First off, we choose a Youtube video page, like <a href="http://youtube.com/watch?v=Xe1a1wHxTyo">this one</a> and stuff it into the BeautifulSoup parser:</p>

<pre><code>#!/usr/bin/env python
from urllib2 import urlopen
from BeautifulSoup import BeautifulSoup
import re # we need regular expressions later

monty_vid = urlopen('http://youtube.com/watch?v=Xe1a1wHxTyo')
page = BeautifulSoup(monty_vid)

print page.prettify()
</code></pre>

<p>The last line pretty-prints the HTML you just retrieved, so you can check if it&#8217;s an existing page or a 404. Next, we&#8217;d like to extract some meta data like title, description and tags for the video. Luckily, those are provided in the HTML head as meta tags, in order, so we can extract the content attribute from those. The result object is a list with elements that act like dictionaries:</p>

<pre><code>meta = {}
meta['title'] = page.head('meta')[0]['content']
meta['description'] = page.head('meta')[1]['content']
meta['tags'] = page.head('meta')[2]['content'].split(', ')
</code></pre>

<p>Notice that all the extracted strings are Unicode, and we made a list of tags by splitting the string. Next up, we want the number of views and the number of ratings. Luckily, the former is available in a span tag with a dedicated class which we can retrieve with the following search on the document body:</p>

<pre><code>views = page.body('span',"viewCount")[0].string
</code></pre>

<p>The number of ratings is not readily marked up, but available as a string like &#8220;55 ratings&#8221;, so we need another technique &#8212; pattern matching within a certain div:</p>

<pre><code>numratings = page.body('div', id='defaultRatingMessage',    
                        text=re.compile('ratings'))[0].string.split()[0]
</code></pre>

<p>The body() method with the text parameter gives us all tags in the named div that match our simple pattern, from which we extract the first part with split()[0]. Finally, we want not only the number of ratings but also the rating itself. The rating is not available as a number in the document, but is indicated by 0 to 5 star images, which we can count:</p>

<pre><code>rating = len(page.body('img','rating icn_star_full_19x20png'))
</code></pre>
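
<p>One follow-up step, sketched here rather than taken from the script above: the scraped values are still strings, so they need converting before they can be tracked as statistics. The helper below is hypothetical (the name <code>to_count</code>, the sample strings, and the idea that counts may carry thousands separators are all assumptions), and it uses only the standard library:</p>

```python
import re

# Hypothetical helper: pull the first integer out of a scraped string.
# Assumes counts look like "55", "55 ratings" or "1,234,567".
_NUMBER = re.compile(r"\d[\d,]*")


def to_count(text):
    """Return the first integer found in text, or None if there is none."""
    match = _NUMBER.search(text)
    if match is None:
        return None
    # Drop thousands separators before converting.
    return int(match.group().replace(",", ""))


print(to_count("55 ratings"))
print(to_count("1,234,567"))
print(to_count("no numbers here"))
```

<p>Returning <code>None</code> instead of raising makes it easy to spot a layout change: a missing number shows up as a gap in the data rather than a crashed script.</p>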

<p>Of course, there are dozens of different ways to query the HTML for some values, but these work with the current layout. Since you have to update your scraping script with every change, why bother with optimal queries?</p>
]]></content:encoded>
			<wfw:commentRss>http://underused.org/2007/12/11/scraping-youtube-with-beautiful-soup/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Google Chart API released</title>
		<link>http://underused.org/2007/12/07/google-chart-api-released/</link>
		<comments>http://underused.org/2007/12/07/google-chart-api-released/#comments</comments>
		<pubDate>Fri, 07 Dec 2007 12:33:09 +0000</pubDate>
		<dc:creator>Michael Scharkow</dc:creator>
		
		<category><![CDATA[Hacking]]></category>

		<category><![CDATA[API]]></category>

		<category><![CDATA[Charts]]></category>

		<category><![CDATA[Google]]></category>

		<guid isPermaLink="false">http://underused.org/2007/12/07/google-chart-api-released/</guid>
		<description><![CDATA[The awesome new Chart API is probably the best Google product since, um, Google Search. Gotta love the text-as-data pattern and the fact that you only need to fill an image tag with some parameters. And did I tell you it&#8217;s fast?! 

(via Tobias Lütke)
]]></description>
			<content:encoded><![CDATA[<p>The awesome new <a href="http://code.google.com/apis/chart/">Chart API</a> is probably the best Google product since, um, Google Search. Gotta love the text-as-data pattern and the fact that you only need to fill an image tag with some parameters. And did I tell you it&#8217;s <em>fast</em>?! </p>
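
<p>To illustrate the text-as-data pattern, here is a minimal sketch that builds such an image URL. It assumes the URL scheme from the Chart API documentation at launch (the <code>chart.apis.google.com/chart</code> endpoint with the <code>cht</code>, <code>chs</code>, <code>chd</code> and <code>chl</code> parameters); the function name and the sample data are made up:</p>

```python
from urllib.parse import urlencode

# Assumed endpoint, as documented when the Chart API was released.
CHART_ENDPOINT = "http://chart.apis.google.com/chart"


def chart_url(values, labels, width=250, height=100, chart_type="p3"):
    """Build a chart image URL; the data travels as plain text parameters."""
    params = {
        "cht": chart_type,                              # chart type, p3 = 3D pie
        "chs": f"{width}x{height}",                     # image size in pixels
        "chd": "t:" + ",".join(str(v) for v in values), # text-encoded data
        "chl": "|".join(labels),                        # one label per value
    }
    # Keep the separators the API uses readable instead of percent-encoding them.
    return CHART_ENDPOINT + "?" + urlencode(params, safe=":,|")


print(chart_url([60, 40], ["Hello", "World"]))
```

<p>The resulting URL goes straight into the <code>src</code> attribute of an image tag; there is nothing else to set up.</p>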

<p>(via <a href="http://blog.leetsoft.com/2007/12/6/google-chart-api">Tobias Lütke</a>)</p>
]]></content:encoded>
			<wfw:commentRss>http://underused.org/2007/12/07/google-chart-api-released/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Another housemate on the underused server</title>
		<link>http://underused.org/2007/10/31/another-housemate-on-the-underused-server/</link>
		<comments>http://underused.org/2007/10/31/another-housemate-on-the-underused-server/#comments</comments>
		<pubDate>Wed, 31 Oct 2007 15:46:27 +0000</pubDate>
		<dc:creator>Michael Scharkow</dc:creator>
		
		<category><![CDATA[underused.org]]></category>

		<category><![CDATA[Hacking]]></category>

		<guid isPermaLink="false">http://underused.org/2007/10/31/another-housemate-on-the-underused-server/</guid>
		<description><![CDATA[Welcome Christian Siefkes, text processing/spam filtering wizard and author of a very cool book on peer economy. Go visit and buy the book!
]]></description>
			<content:encoded><![CDATA[<p>Welcome <a href="http://siefkes.net">Christian Siefkes</a>, text processing/spam filtering wizard and author of a very cool book on <a href="http://www.peerconomy.org/wiki/Main_Page">peer economy</a>. Go visit and buy the book!</p>
]]></content:encoded>
			<wfw:commentRss>http://underused.org/2007/10/31/another-housemate-on-the-underused-server/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Textpattern - a neat CMS alternative</title>
		<link>http://underused.org/2007/10/29/textpattern-a-neat-cms-alternative/</link>
		<comments>http://underused.org/2007/10/29/textpattern-a-neat-cms-alternative/#comments</comments>
		<pubDate>Mon, 29 Oct 2007 19:05:04 +0000</pubDate>
		<dc:creator>Michael Scharkow</dc:creator>
		
		<category><![CDATA[Web Design]]></category>

		<category><![CDATA[CMS]]></category>

		<category><![CDATA[Textpattern]]></category>

		<category><![CDATA[TYPO3]]></category>

		<guid isPermaLink="false">http://underused.org/2007/10/29/textpattern-a-neat-cms-alternative/</guid>
		<description><![CDATA[I don&#8217;t know why I only looked at Textpattern now but it surely looks nice for smaller sites that I set up for friends. I used to do everything with TYPO3 but had a hard time trimming it down for simple use cases with only a couple of pages and no dynamic stuff. Textpattern has [...]]]></description>
			<content:encoded><![CDATA[<p>I don&#8217;t know why I only looked at <a href="http://textpattern.com">Textpattern</a> now but it surely looks nice for smaller sites that I set up for friends. I used to do everything with TYPO3 but had a hard time trimming it down for simple use cases with only a couple of pages and no dynamic stuff. Textpattern has a very intuitive templating system and comes with all the categorization and syndication features you might want for a start. The admin interface is ugly but so is TYPO3&#8217;s. And Textile is better than HTMLArea although not as cool as Markdown. I&#8217;ll give TXP a try for the next project.</p>
]]></content:encoded>
			<wfw:commentRss>http://underused.org/2007/10/29/textpattern-a-neat-cms-alternative/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Making the switch from SPSS to R with Quick-R</title>
		<link>http://underused.org/2007/10/25/making-the-switch-from-spss-to-r-with-quick-r/</link>
		<comments>http://underused.org/2007/10/25/making-the-switch-from-spss-to-r-with-quick-r/#comments</comments>
		<pubDate>Thu, 25 Oct 2007 12:22:55 +0000</pubDate>
		<dc:creator>Michael Scharkow</dc:creator>
		
		<category><![CDATA[Science]]></category>

		<category><![CDATA[R]]></category>

		<category><![CDATA[Research]]></category>

		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://underused.org/2007/10/25/making-the-switch-from-spss-to-r-with-quick-r/</guid>
		<description><![CDATA[While reading the truly enlightening Data Analysis Using Regression and Multilevel/Hierarchical Models by Gelman and Hill and preparing for an introductory Bayesian course that is offered at FU Berlin this semester, I decided to slowly switch to R more or less completely for all statistical day-to-day work. 

There are some books and articles available specifically for beginners [...]]]></description>
			<content:encoded><![CDATA[<p>While reading the truly enlightening <a href="http://www.stat.columbia.edu/~gelman/arm/">Data Analysis Using Regression and Multilevel/Hierarchical Models</a> by Gelman and Hill and preparing for an introductory Bayesian course that is offered at FU Berlin this semester, I decided to slowly switch to <a href="http://r-project.org">R</a> more or less completely for all statistical day-to-day work. </p>

<p>There are some books and articles available specifically for beginners or switchers from SPSS, Stata or SAS, but the most valuable source is a collection of common use cases, including data management and graphs, freely available as <a href="http://www.statmethods.net/">Quick-R</a>.</p>

<p>(via <a href="http://dataninja.wordpress.com/2007/10/23/quick-r-a-great-r-tutorial-site/">Dataninja</a>)</p>
]]></content:encoded>
			<wfw:commentRss>http://underused.org/2007/10/25/making-the-switch-from-spss-to-r-with-quick-r/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
