Data Analysis

Information about selecting data analysis tools and finding datasets

Key Questions before You Get Started

Sports:

  • R
  • Tableau
  • PostgreSQL
  • Excel
  • Apache Spark

Sciences:

  • Redash
  • Jupyter Notebook
  • Mode
  • SPSS

Here are some recommendations to get you started: 

Small-Medium Datasets:

  • Excel
  • Redash
  • R
  • Jupyter Notebook

Large Datasets:

  • Apache Spark
  • SPSS
  • Tableau
  • PostgreSQL
  • Mode

  • Tableau works with Python.
  • PostgreSQL works with Python, C/C++, and SQL.
  • Excel works with Python and VBA.
  • Apache Spark works with Python, Scala, and SQL.
  • Mode works with Python and SQL.
  • SPSS works with Python.
  • Jupyter Notebook works with Python and Scala.
  • Redash works with Python and SQL.
  • R works with C/C++.
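
Several of the tools above pair Python with SQL. As a rough illustration of what that pairing looks like, the sketch below runs a SQL query from Python. It is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for a database such as PostgreSQL (which would need a running server and a separate driver), and the table name, the column names, and the sample rows are all invented for the example.

```python
import sqlite3

# Hypothetical sample data: (athlete, points) rows invented for illustration.
rows = [("Ana", 12), ("Ben", 7), ("Ana", 9), ("Cy", 15), ("Ben", 11)]


def total_points(records):
    """Load records into an in-memory database and total points per athlete with SQL."""
    # ":memory:" creates a temporary database; sqlite3 stands in for PostgreSQL here.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE scores (athlete TEXT, points INTEGER)")
        # Parameterized inserts keep the data separate from the SQL text.
        conn.executemany("INSERT INTO scores VALUES (?, ?)", records)
        query = (
            "SELECT athlete, SUM(points) AS total "
            "FROM scores GROUP BY athlete ORDER BY total DESC, athlete"
        )
        return conn.execute(query).fetchall()
    finally:
        conn.close()


result = total_points(rows)
print(result)
```

The SQL does the grouping and sorting, while Python handles loading the data and using the result. The same division of labor applies in a notebook or a hosted tool, although the connection details will differ.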

Software-as-a-Service (SaaS) lets you use software over the internet ("the cloud") instead of installing it yourself. However, it may be beneficial for you to have the software installed on your own device.

Here is a brief summary to get you started on which option is best for you:

  • SaaS is a good option when you don’t have the time, capital, or expertise to build your own applications or host them on-premises [1]. It’s also beneficial for startups or small companies that need to launch e-commerce quickly and don’t have time to deal with server or software issues [2]. Larger companies may use SaaS for short-term projects or applications that aren’t needed all year long [1].
  • Local hosting is a good option when you want full control and autonomy over your installation [3]. It’s recommended mainly for users with advanced technical knowledge, because the software provider cannot guarantee, or offer a Service Level Agreement for, the robustness, performance, and scalability of your own infrastructure [3].

Video games are not the only things whose graphics have improved: graphics for data have become very advanced too. Sometimes, though, you may not need all the sophistication that a tool can provide. See the recommendations below to help you decide which tool is best for you.

  • R: ggplot2, lattice, and base R graphics are popular visualization packages in R that can be used to create a wide range of plots and charts.
  • Tableau: Tableau Desktop is a powerful data visualization tool that allows you to create interactive dashboards and reports.
  • PostgreSQL: pgAdmin is a popular open-source administration and management tool for PostgreSQL that includes a graphical query tool and basic charting of query results.
  • Excel: Excel charts are a built-in feature of Microsoft Excel that allow you to create a wide range of charts and graphs.
  • Apache Spark: Spark itself has no built-in charting, but Spark SQL query results can be converted to a pandas DataFrame in PySpark and plotted with libraries such as Matplotlib and Seaborn.
  • Redash: Redash is an open-source data visualization platform that allows you to create interactive dashboards and reports using various visualization tools such as charts, tables, and maps.
  • Jupyter Notebook: Matplotlib, Seaborn, and Plotly are popular data visualization libraries in Python that can be used to create a wide range of plots and charts in Jupyter Notebook.
  • Mode: Mode charts is an easy-to-use data visualization tool that allows you to create interactive charts and dashboards using various chart types such as bar charts, line charts, and scatter plots.
  • SPSS: SPSS charts are built-in features of IBM SPSS Statistics that allow you to create a wide range of charts and graphs.