Data Science: Programming Tools

Tools, software packages, and library resources for working with big data and doing data science.

Learning Python

Python Libraries

Statistics

  • pymc
    PyMC is a Python package for Bayesian estimation using Markov chain Monte Carlo (MCMC). It implements the Metropolis-Hastings algorithm as a Python class.
  • PyMVPA
    A Python package that offers algorithms for classification, regression, and feature selection, along with data import and export.
    Works well with scikit-learn and MDP.
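
For readers new to MCMC, the Metropolis-Hastings algorithm mentioned under pymc can be sketched in a few lines of plain Python. This is only an illustrative sketch, not pymc's API: the function names, the Gaussian random-walk proposal, the step size, and the burn-in length are all assumptions chosen for the example.

```python
import math
import random


def metropolis_hastings(log_density, start, n_samples, step=1.0, seed=0):
    """Draw samples from an unnormalized 1-D density with a random-walk proposal.

    Hypothetical helper for illustration; not part of pymc.
    """
    rng = random.Random(seed)
    current = start
    current_logp = log_density(current)
    samples = []
    for _ in range(n_samples):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = current + rng.gauss(0.0, step)
        proposal_logp = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        log_ratio = proposal_logp - current_logp
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current, current_logp = proposal, proposal_logp
        # A rejected proposal repeats the current state in the chain.
        samples.append(current)
    return samples


def standard_normal_log_density(x):
    # Log of exp(-x^2 / 2); the normalizing constant is not needed.
    return -0.5 * x * x


samples = metropolis_hastings(standard_normal_log_density, start=0.0, n_samples=20000)
kept = samples[2000:]  # discard an assumed burn-in period
mean = sum(kept) / len(kept)
variance = sum((x - mean) ** 2 for x in kept) / len(kept)
print(f"mean={mean:.3f} variance={variance:.3f}")
```

With a standard normal target, the sample mean and variance should land near 0 and 1. A library such as pymc adds model specification, tuning, and diagnostics on top of this basic loop.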

Natural Language Processing

Machine Learning

Social Media

  • Twython
    Twython is a Python library that wraps the Twitter API, providing a way to access Twitter data.

Visualization

  • Orange 
    Open source data visualization and analysis tool. Data mining through visual programming or Python scripting. Add-ons for bioinformatics and text mining are available.

Python Applications