Polyglot data analysis, visually

Demonstrated with Python and R

 
Laurent Gautier - Jupyter Days Boston 2016

(notebook as HTML | ipynb)

(slides on GitHub)

Slides

http://lgautier.github.io/jpd-pdapr-slides

Acknowledgment

  • Jupyter Days Boston organizers and sponsors
  • rmagic authors and maintainers (Dav Clark)
  • rpy2 contributors
  • rpy2 users (in the hope they become contributors)
bilingual

You* have probably already done polyglot data analysis

(*: yes, you)

Python and SQL


sql = """
select value
from measures
where value > 0
"""

# dbcon: an open DB-API 2.0 connection (for example, from sqlite3.connect())
cursor = dbcon.cursor()
cursor.execute(sql)
result = cursor.fetchall()
	    
  • Hardly groundbreaking.
  • That's polyglot data analysis though.
  • What about without SQL?
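
For anyone who wants to run the snippet above, here is a minimal self-contained sketch. It assumes an in-memory SQLite database stands in for `dbcon`; the table layout and the sample values are made up for illustration and are not the data used in the talk.

```python
import sqlite3

# Assumption: an in-memory SQLite database plays the role of `dbcon`.
dbcon = sqlite3.connect(":memory:")

# Hypothetical table and sample values, for illustration only.
dbcon.execute("create table measures (value real)")
dbcon.executemany(
    "insert into measures (value) values (?)",
    [(-1.5,), (0.0,), (2.0,), (3.5,)],
)

# The SQL is a second language embedded in a Python string.
sql = """
select value
from measures
where value > 0
"""

cursor = dbcon.cursor()
cursor.execute(sql)
result = cursor.fetchall()  # a list of one-element tuples
print(result)
```

Python handles the connection and the results; SQL expresses the query. That is already two languages in one analysis.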

Python and ORMs


cursor = dbcon.cursor()
sql = """
select value
from measures
where value > 0
"""
cursor.execute(sql)
result = cursor.fetchall()
		    


result = (Measure
          .select()
          .where(Measure.value > 0))
	    
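(example in the chained style used by ORMs such as peewee; SQLObject itself would write Measure.select(Measure.q.value > 0))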
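
How can a Python expression such as `Measure.value > 0` end up as SQL? A rough sketch of the idea, using operator overloading, is below. This is a toy written for this illustration and is not how SQLObject or any other real ORM is implemented; the `Column` and `Query` classes and the `to_sql` and `run` methods are hypothetical names.

```python
import sqlite3


class Column:
    """A table column; comparing it builds a SQL condition, not a bool."""

    def __init__(self, name):
        self.name = name

    def __gt__(self, other):
        # Return a (SQL fragment, parameters) pair using a placeholder.
        return (f"{self.name} > ?", (other,))


class Query:
    """A query that is only translated to SQL when asked."""

    def __init__(self, table, condition=None):
        self.table = table
        self.condition = condition

    def where(self, condition):
        return Query(self.table, condition)

    def to_sql(self):
        sql = f"select * from {self.table}"
        if self.condition is None:
            return sql, ()
        fragment, params = self.condition
        return f"{sql} where {fragment}", params

    def run(self, dbcon):
        sql, params = self.to_sql()
        return dbcon.execute(sql, params).fetchall()


class Measure:
    """Toy model class mapped to the (hypothetical) table `measures`."""

    value = Column("value")

    @classmethod
    def select(cls):
        return Query("measures")


query = Measure.select().where(Measure.value > 0)
sql_text, sql_params = query.to_sql()
print(sql_text, sql_params)

# Assumption: an in-memory SQLite database with made-up sample values.
dbcon = sqlite3.connect(":memory:")
dbcon.execute("create table measures (value real)")
dbcon.executemany(
    "insert into measures (value) values (?)", [(-1.0,), (2.0,), (3.5,)]
)
rows = query.run(dbcon)
print(rows)
```

The SQL is still there; it is generated from Python objects instead of being written by hand.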

Python vs R?

PythonR!

R-in-Python

Ladies and gentlemen, start your engines!

Jupyter Notebook pothole data (source: https://data.cambridgema.gov)

The easy way:


docker run --rm -it -p 8888:8888 \
            rpy2/jpd-pdapr-slides
	    
Visit http://localhost:8888/ *

*: When Docker runs through docker-machine (Windows, OS X), replace localhost with the address reported by:


	      docker-machine ip [MACHINE]
	    
