Monthly archive: October 2012

The RANK operator is in Pig! :’)

OK, maybe it happened some time ago (one month?), but I was busy! Some good news after two and a half months of hard work, guided by Gianmarco De Francisci Morales. I don’t have words to thank him for his time … Continue reading

Posted in Hadoop, Pig

Crawling and Parsing with Java

Yes, another Java crawler/parser! But it offers something different from the others: it imitates the DOM with OO in order to parse content. And of course, you can do it «on-line» or from a raw file. So, make some soup and … Continue reading

Posted in Miscellaneous

Finding names in raw text

Sometimes it is difficult to find names in a text. Maybe the most naïve way is to get all the words that start with a capital letter, and that’s it! But if you check this paragraph you could find … Continue reading

Posted in Miscellaneous
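
To make the naïve approach from the teaser above concrete, here is a minimal Java sketch. It is not the code from the post: the class name `NaiveNameFinder`, the method `findCapitalizedWords`, and the sample sentence are all hypothetical, and the regular expression is just one possible reading of "words that start with a capital letter".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not taken from the post: collects every word
// that starts with an uppercase letter.
final class NaiveNameFinder {
    // An uppercase letter that is not preceded by another letter,
    // followed by any number of letters.
    private static final Pattern CAPITALIZED =
            Pattern.compile("(?<!\\p{L})\\p{Lu}\\p{L}*");

    static List<String> findCapitalizedWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher matcher = CAPITALIZED.matcher(text);
        while (matcher.find()) {
            words.add(matcher.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // Assumed sample input, only for illustration.
        String sample = "Yesterday Alice met Bob in Paris.";
        // Prints [Yesterday, Alice, Bob, Paris]
        System.out.println(findCapitalizedWords(sample));
    }
}
```

Note that "Yesterday" comes back as well, simply because it opens the sentence. That is one reason a capital-letter filter alone is probably too naïve for real text.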