The RANK operator is in Pig! :’)

OK, maybe it happened some time ago (a month?). But I was busy!

Some good news after two and a half months of hard work, guided by Gianmarco De Francisci Morales. I don’t have words to thank him for his time and advice throughout this process.

Now it’s time for the formal documentation (in progress; hopefully I’ll finish it this week, hurrah!) and for actually using it!

My pleasure: https://issues.apache.org/jira/browse/PIG-2353

Posted in Hadoop, Pig

Crawling and Parsing with Java

Yes, another Java crawler/parser! But it offers something different from the others: it models the DOM in an object-oriented way in order to parse content.
And of course, you can do it «on-line» or from a raw file.

So, make some soup and get your hands on JSoup!

Posted in Miscellaneous

Finding names in raw text

Sometimes it is difficult to pick out names in a text. Maybe the most naïve way is to take every word that starts with a capital letter, and that’s it! But if you check this very paragraph, you would get names like «Maybe» or «But» (???). Fortunately, there are smarter ideas like this one, which uses a regex with a few specific rules:

  • A name is composed of at least two words, each starting with a capital letter.
  • It may have more than two words, like «James Van de Putte» or something similar.
  • The words are separated by whitespace.
  • … and so on.

This is the final regex used to extract names (specifically, compound names) from a text:

[A-Z]([a-z]+|\.)(?:\s+[A-Z]([a-z]+|\.))*(?:\s+[a-z][a-z\-]+){0,2}\s+[A-Z]([a-z]+|\.)
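To see the regex in action, here is a minimal sketch in Python. The language choice, the `find_names` helper and the sample sentence are my own assumptions for illustration; only the pattern itself comes from the post.

```python
import re

# The pattern from the post, unchanged. It reads as:
#   a capitalized word (or an initial such as "J."),
#   any number of further capitalized words,
#   up to two lowercase particles ("de", "van", ...),
#   and a final capitalized word.
NAME_RE = re.compile(
    r"[A-Z]([a-z]+|\.)(?:\s+[A-Z]([a-z]+|\.))*"
    r"(?:\s+[a-z][a-z\-]+){0,2}\s+[A-Z]([a-z]+|\.)"
)


def find_names(text):
    """Return every compound name matched in text (hypothetical helper)."""
    # finditer + group(0) gives the whole match; findall would return
    # the contents of the capturing groups instead.
    return [match.group(0) for match in NAME_RE.finditer(text)]


sample = (
    "the paper by James Van de Putte was reviewed by "
    "Gianmarco De Francisci Morales last week."
)
print(find_names(sample))
# → ['James Van de Putte', 'Gianmarco De Francisci Morales']
```

Keep in mind that it is still a heuristic: a capitalized word sitting right before a name (for example, the first word of a sentence) is absorbed into the match, so «Yesterday James Van de Putte» would come back as a single name.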

 

Posted in Miscellaneous

Install s3cmd tools on OS X

While looking for a way to access an S3 bucket, I found this self-explanatory site on how to install s3cmd on OS X.

Posted in ec2

Symfony and Mac

After getting this beautiful error:

Warning: PDO::__construct(): [2002] No such file or directory (trying to connect via unix:///var/mysql/mysql.sock) in /some/path/here/symfony/lib/plugins/sfDoctrinePlugin/lib/vendor/doctrine/Doctrine/Connection.php on line 470

PDO Connection Error: SQLSTATE[HY000] [2002] No such file or directory

Some solutions say that you first need to find php.ini (that’s like looking for the Holy Grail or something similar!). So the easiest way is to «join your enemies»: create a link at the path where PHP looks for mysql.sock, pointing to the folder where the socket is actually created:

# Make /var/mysql point to MAMP's socket folder, so /var/mysql/mysql.sock resolves
cd /var
sudo ln -s /Applications/MAMP/tmp/mysql mysql
(Taken from Stack Overflow)

In my case, I was using MAMP.


Posted in Symfony