{"id":235,"date":"2018-08-01T05:04:33","date_gmt":"2018-08-01T04:04:33","guid":{"rendered":"http:\/\/blog.espol.edu.ec\/xallam\/?p=235"},"modified":"2019-01-09T18:43:05","modified_gmt":"2019-01-09T17:43:05","slug":"python-pandas-ii","status":"publish","type":"post","link":"https:\/\/blog.espol.edu.ec\/xallam\/2018\/08\/01\/python-pandas-ii\/","title":{"rendered":"Python Pandas - II"},"content":{"rendered":"<div style=\"width: 1210px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-medium\" src=\"https:\/\/media.metrolatam.com\/2018\/01\/22\/justiceleagueavengersdccomicsmarvelportada-1200x600.jpg\" width=\"1200\" height=\"600\" \/><p class=\"wp-caption-text\">Superheroes<\/p><\/div>\n<p>In this second part of the tutorial, we will look at grouping data.<\/p>\n<h2>Column selection<\/h2>\n<p>Working with every column of a dataset can be unwieldy, so you can use <em>subsetting<\/em>: pass a list with the names of the columns you want.<\/p>\n<div class='dropshadowboxes-container ' style='width:auto;'>\r\n                            <div class='dropshadowboxes-drop-shadow dropshadowboxes-rounded-corners dropshadowboxes-inside-and-outside-shadow dropshadowboxes-lifted-both dropshadowboxes-effect-default' style=' border: 1px solid #dddddd; height:; background-color:#ffffff;    '>\r\n                            nuevosDatos = superheroes[['name','Publisher']]<br \/>\nstarwars = nuevosDatos['Publisher'] == 'George Lucas'<br \/>\nprint(nuevosDatos[starwars])<br \/>\n\r\n                            <\/div>\r\n                        <\/div>\n<p>The statements above create a new <strong>dataset<\/strong> called <em>nuevosDatos<\/em> that holds only the name and Publisher columns. We can then run operations on it, apply conditions, and so on; here the boolean condition <em>starwars<\/em> keeps only the rows whose Publisher is 'George Lucas'.<\/p>\n<h2>Grouping<\/h2>\n<p>Sometimes you need to compute an aggregate (count, sum, mean, max, 
min) for each group of data, for example:<\/p>\n<p>\"Total number of superheroes <strong>per<\/strong> publisher\" or \"Maximum superhero height <strong>by<\/strong> alignment and gender\"<\/p>\n<p style=\"text-align: left\"><a href=\"http:\/\/blog.espol.edu.ec\/xallam\/files\/2018\/07\/attention_PNG31.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignleft wp-image-271\" src=\"http:\/\/blog.espol.edu.ec\/xallam\/files\/2018\/07\/attention_PNG31.png\" alt=\"\" width=\"26\" height=\"26\" srcset=\"https:\/\/blog.espol.edu.ec\/xallam\/files\/2018\/07\/attention_PNG31.png 256w, https:\/\/blog.espol.edu.ec\/xallam\/files\/2018\/07\/attention_PNG31-150x150.png 150w\" sizes=\"auto, (max-width: 26px) 100vw, 26px\" \/><\/a>To answer \"Total number of superheroes <strong>per<\/strong> publisher\":<\/p>\n<ol>\n<li>Call <strong>groupby<\/strong> on the requested column or columns<div class='dropshadowboxes-container ' style='width:auto;'>\r\n                            <div class='dropshadowboxes-drop-shadow dropshadowboxes-rounded-corners dropshadowboxes-inside-and-outside-shadow dropshadowboxes-lifted-both dropshadowboxes-effect-default' style=' border: 1px solid #dddddd; height:; background-color:#ffffff;    '>\r\n                            grupos = superheroes.groupby('Publisher')\r\n                            <\/div>\r\n                        <\/div><\/li>\n<li>Select the requested column or columns<div class='dropshadowboxes-container ' style='width:auto;'>\r\n                            <div class='dropshadowboxes-drop-shadow dropshadowboxes-rounded-corners dropshadowboxes-inside-and-outside-shadow dropshadowboxes-lifted-both dropshadowboxes-effect-default' style=' border: 1px solid #dddddd; height:; background-color:#ffffff;    '>\r\n                            columnas = grupos[['Publisher']] \r\n                            <\/div>\r\n                        
<\/div><\/li>\n<li>Apply the requested operation. Here, \"total number\" means counting the non-missing values in each group.<div class='dropshadowboxes-container ' style='width:auto;'>\r\n                            <div class='dropshadowboxes-drop-shadow dropshadowboxes-rounded-corners dropshadowboxes-inside-and-outside-shadow dropshadowboxes-lifted-both dropshadowboxes-effect-default' style=' border: 1px solid #dddddd; height:; background-color:#ffffff;    '>\r\n                            print(columnas.count())\r\n                            <\/div>\r\n                        <\/div><\/li>\n<\/ol>\n<p>The result is the following:<br \/>\n\n<table id=\"tablepress-1\" class=\"tablepress tablepress-id-1\">\n<thead>\n<tr class=\"row-1\">\n\t<th class=\"column-1\">Publisher<\/th><th class=\"column-2\">Publisher<\/th>\n<\/tr>\n<\/thead>\n<tbody class=\"row-striping row-hover\">\n<tr class=\"row-2\">\n\t<td class=\"column-1\">ABC Studios<\/td><td class=\"column-2\">4<\/td>\n<\/tr>\n<tr class=\"row-3\">\n\t<td class=\"column-1\">DC Comics<\/td><td class=\"column-2\">215<\/td>\n<\/tr>\n<tr class=\"row-4\">\n\t<td class=\"column-1\">Dark Horse Comics<\/td><td class=\"column-2\">18<\/td>\n<\/tr>\n<tr class=\"row-5\">\n\t<td class=\"column-1\">George Lucas<\/td><td class=\"column-2\">14<\/td>\n<\/tr>\n<tr class=\"row-6\">\n\t<td class=\"column-1\">Hanna-Barbera<\/td><td class=\"column-2\">1<\/td>\n<\/tr>\n<tr class=\"row-7\">\n\t<td class=\"column-1\">HarperCollins<\/td><td class=\"column-2\">6<\/td>\n<\/tr>\n<tr class=\"row-8\">\n\t<td class=\"column-1\">IDW Publishing<\/td><td class=\"column-2\">4<\/td>\n<\/tr>\n<tr class=\"row-9\">\n\t<td class=\"column-1\">Icon Comics<\/td><td class=\"column-2\">4<\/td>\n<\/tr>\n<tr class=\"row-10\">\n\t<td class=\"column-1\">Image Comics<\/td><td class=\"column-2\">14<\/td>\n<\/tr>\n<tr class=\"row-11\">\n\t<td class=\"column-1\">J. K. 
Rowling<\/td><td class=\"column-2\">1<\/td>\n<\/tr>\n<tr class=\"row-12\">\n\t<td class=\"column-1\">J. R. R. Tolkien<\/td><td class=\"column-2\">1<\/td>\n<\/tr>\n<tr class=\"row-13\">\n\t<td class=\"column-1\">Marvel Comics<\/td><td class=\"column-2\">388<\/td>\n<\/tr>\n<tr class=\"row-14\">\n\t<td class=\"column-1\">Microsoft<\/td><td class=\"column-2\">1<\/td>\n<\/tr>\n<tr class=\"row-15\">\n\t<td class=\"column-1\">NBC - Heroes<\/td><td class=\"column-2\">19<\/td>\n<\/tr>\n<tr class=\"row-16\">\n\t<td class=\"column-1\">Rebellion<\/td><td class=\"column-2\">1<\/td>\n<\/tr>\n<tr class=\"row-17\">\n\t<td class=\"column-1\">Shueisha<\/td><td class=\"column-2\">4<\/td>\n<\/tr>\n<tr class=\"row-18\">\n\t<td class=\"column-1\">Sony Pictures<\/td><td class=\"column-2\">2<\/td>\n<\/tr>\n<tr class=\"row-19\">\n\t<td class=\"column-1\">South Park<\/td><td class=\"column-2\">1<\/td>\n<\/tr>\n<tr class=\"row-20\">\n\t<td class=\"column-1\">Star Trek<\/td><td class=\"column-2\">6<\/td>\n<\/tr>\n<tr class=\"row-21\">\n\t<td class=\"column-1\">SyFy<\/td><td class=\"column-2\">5<\/td>\n<\/tr>\n<tr class=\"row-22\">\n\t<td class=\"column-1\">Team Epic TV<\/td><td class=\"column-2\">5<\/td>\n<\/tr>\n<tr class=\"row-23\">\n\t<td class=\"column-1\">Titan Books<\/td><td class=\"column-2\">1<\/td>\n<\/tr>\n<tr class=\"row-24\">\n\t<td class=\"column-1\">Universal Studios<\/td><td class=\"column-2\">1<\/td>\n<\/tr>\n<tr class=\"row-25\">\n\t<td class=\"column-1\">Wildstorm<\/td><td class=\"column-2\">3<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/p>\n<p><a href=\"http:\/\/blog.espol.edu.ec\/xallam\/files\/2018\/07\/attention_PNG31.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignleft wp-image-271\" src=\"http:\/\/blog.espol.edu.ec\/xallam\/files\/2018\/07\/attention_PNG31.png\" alt=\"\" width=\"26\" height=\"26\" srcset=\"https:\/\/blog.espol.edu.ec\/xallam\/files\/2018\/07\/attention_PNG31.png 256w, 
https:\/\/blog.espol.edu.ec\/xallam\/files\/2018\/07\/attention_PNG31-150x150.png 150w\" sizes=\"auto, (max-width: 26px) 100vw, 26px\" \/><\/a>For the second example, \"Maximum superhero height <strong>by<\/strong> alignment and gender\":<\/p>\n<p>The columns to group by are \"Alignment\" and \"Gender\". The columns to select are \"Alignment\", \"Gender\" and \"Height\". Finally, the operation is max.<\/p>\n<p>The result can be read as follows:<\/p>\n<ul>\n<li>The tallest bad male superhero measures 366.0, and<\/li>\n<li>The tallest bad female superhero measures 218.0<\/li>\n<\/ul>\n\n<table id=\"tablepress-3\" class=\"tablepress tablepress-id-3\">\n<thead>\n<tr class=\"row-1\">\n\t<th class=\"column-1\">Alignment<\/th><th class=\"column-2\">Gender<\/th><th class=\"column-3\">Height<\/th>\n<\/tr>\n<\/thead>\n<tbody class=\"row-striping row-hover\">\n<tr class=\"row-2\">\n\t<td class=\"column-1\">-<\/td><td class=\"column-2\">-<\/td><td class=\"column-3\">-99.0<\/td>\n<\/tr>\n<tr class=\"row-3\">\n\t<td class=\"column-1\">-<\/td><td class=\"column-2\">Male<\/td><td class=\"column-3\">229.0<\/td>\n<\/tr>\n<tr class=\"row-4\">\n\t<td class=\"column-1\">bad<\/td><td class=\"column-2\">-<\/td><td class=\"column-3\">198.0<\/td>\n<\/tr>\n<tr class=\"row-5\">\n\t<td class=\"column-1\">bad<\/td><td class=\"column-2\">Female<\/td><td class=\"column-3\">218.0<\/td>\n<\/tr>\n<tr class=\"row-6\">\n\t<td class=\"column-1\">bad<\/td><td class=\"column-2\">Male<\/td><td class=\"column-3\">366.0<\/td>\n<\/tr>\n<tr class=\"row-7\">\n\t<td class=\"column-1\">good<\/td><td class=\"column-2\">-<\/td><td class=\"column-3\">193.0<\/td>\n<\/tr>\n<tr class=\"row-8\">\n\t<td class=\"column-1\">good<\/td><td class=\"column-2\">Female<\/td><td class=\"column-3\">366.0<\/td>\n<\/tr>\n<tr class=\"row-9\">\n\t<td class=\"column-1\">good<\/td><td class=\"column-2\">Male<\/td><td 
class=\"column-3\">975.0<\/td>\n<\/tr>\n<tr class=\"row-10\">\n\t<td class=\"column-1\">neutral<\/td><td class=\"column-2\">-<\/td><td class=\"column-3\">-99.0<\/td>\n<\/tr>\n<tr class=\"row-11\">\n\t<td class=\"column-1\">neutral<\/td><td class=\"column-2\">Female<\/td><td class=\"column-3\">183.0<\/td>\n<\/tr>\n<tr class=\"row-12\">\n\t<td class=\"column-1\">neutral<\/td><td class=\"column-2\">Male<\/td><td class=\"column-3\">876.0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p>Some exercises to practice:<\/p>\n<ul>\n<li>The average height by gender. Make sure to remove the -99 values before any calculation.<\/li>\n<li>Minimum weight and height of the Alien and Mutant superheroes by publisher<\/li>\n<li>How many superheroes are there per eye color?<\/li>\n<li>How many superheroes are there per hair color?<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>In this second part of the tutorial, we will look at grouping data. 
Column selection Working with every column of a dataset can be unwieldy, so you can use subsetting: pass a list with the names &hellip; <a href=\"https:\/\/blog.espol.edu.ec\/xallam\/2018\/08\/01\/python-pandas-ii\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":16,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[146853,20586],"tags":[6553,297,20777],"class_list":["post-235","post","type-post","status-publish","format-standard","hentry","category-coding","category-pandas","tag-datasets","tag-programacion","tag-python"],"_links":{"self":[{"href":"https:\/\/blog.espol.edu.ec\/xallam\/wp-json\/wp\/v2\/posts\/235","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.espol.edu.ec\/xallam\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.espol.edu.ec\/xallam\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.espol.edu.ec\/xallam\/wp-json\/wp\/v2\/users\/16"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.espol.edu.ec\/xallam\/wp-json\/wp\/v2\/comments?post=235"}],"version-history":[{"count":22,"href":"https:\/\/blog.espol.edu.ec\/xallam\/wp-json\/wp\/v2\/posts\/235\/revisions"}],"predecessor-version":[{"id":324,"href":"https:\/\/blog.espol.edu.ec\/xallam\/wp-json\/wp\/v2\/posts\/235\/revisions\/324"}],"wp:attachment":[{"href":"https:\/\/blog.espol.edu.ec\/xallam\/wp-json\/wp\/v2\/media?parent=235"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.espol.edu.ec\/xallam\/wp-json\/wp\/v2\/categories?post=235"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.espol.edu.ec\/xallam\/wp-json\/wp\/
v2\/tags?post=235"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}