The Hacker Within - Davis Chapter

A monthly meetup of researchers, scientists, and engineers who are always learning new tips and tricks to improve their computational and data workflows.

Website [http://www.thehackerwithin.org/davis/]

Email [https://lists.ucdavis.edu/sympa/info/thehackerwithin]

Twitter [@hackerwithin]

Github [https://github.com/thehackerwithin/davis]

Supported by the UCD Data Science Initiative

http://datascience.ucdavis.edu/

Goals

  • Teach each other computing and data skills that support our research needs
  • Build community around scientific computing and data technologies (students, researchers, engineers, staff, faculty, etc.)
  • Engage at all levels of the computational scientific endeavor
  • Language and tool neutral (except for openness)
  • Focus on tutorials, discussion, lightning talks, and hacking

Notes

  • Modeled after similar groups at UC Berkeley and the University of Wisconsin-Madison

In [1]:
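import pandas as pd

# Load the survey responses (one row per response).
df = pd.read_csv('WorkshopSurvey.csv')
# Keep only the topic-rating columns, whose headers start with "Please".
topics = df.filter(regex=r'^Please.', axis=1)
# Shorten each header to the topic name inside the square brackets.
topics.columns = [t.split('[')[1][:-1] for t in topics.columns]
print('Number of responses: {}'.format(len(topics)))
topics.head(3)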


Number of responses: 52
Out[1]:
  | Data Visualization Principles | Visualization: Dynamic & Interactive | Statistical/Machine Learning methods | Natural Language Processing (NLP) | Hadoop & MapReduce | UNIX Shell Tools & Programming | Introduction to R | Introduction to Python | Parallel Programming | Using Graphical Processing Units (GPUs) | Version Control with git | Reproducible Computations, Dynamic Documents with R and/or IPython | Introduction to Cloud Computing | Basics of Databases
0 | Very interested | Very interested | Very interested | Very interested | Very interested | Very interested | Only slightly interested | Very interested | Very interested | Very interested | Very interested | Very interested | Very interested | Very interested
1 | Very interested | Interested | Very interested | Only slightly interested | Only slightly interested | Not interested | Only slightly interested | Very interested | Very interested | Interested | Not interested | Interested | Only slightly interested | Only slightly interested
2 | Very interested | Very interested | Very interested | Interested | Only slightly interested | Interested | Not interested | Not interested | Interested | Only slightly interested | Only slightly interested | Very interested | Only slightly interested | Only slightly interested
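
The column renaming in the cell above relies on each rating header wrapping the topic name in square brackets. Here is a minimal sketch of that parsing step in isolation. The exact question wording in WorkshopSurvey.csv is not shown here, so `example_header` is a made-up string and `topic_from_header` is a hypothetical helper name; only the "Please ... [Topic]" shape is assumed.

```python
def topic_from_header(header):
    """Return the bracketed topic name from a header shaped like '... [Topic]'."""
    # Same expression as in the cell above: take the piece after the first '['
    # and drop its last character, which is assumed to be the closing ']'.
    return header.split('[')[1][:-1]


# Hypothetical header text; the real survey wording may differ.
example_header = 'Please rate your interest [Introduction to R]'
print(topic_from_header(example_header))
```

A header without a '[' would raise an IndexError here, which is why the cell first filters down to the columns that start with "Please".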

In [2]:
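# Tally how many responses fall into each interest level for every topic.
interest = topics.apply(pd.Series.value_counts)
# Order the rows from most to least interested.
interest = interest.reindex(['Very interested', 'Interested',
                             'Only slightly interested', 'Not interested'])

%matplotlib inline
from IPython.core.pylabtools import figsize
figsize(10, 8)
# Stacked bars, one per topic, sorted by the "Very interested" count.
interest.T.sort_values('Very interested').plot(kind='bar', stacked=True)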


Out[2]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f60728fdcf8>
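
To see what the tally behind this plot looks like without the survey file, here is a small sketch that runs the same `value_counts` and `reindex` steps on three topics and the three responses shown in the Out[1] preview. The names `sample`, `levels`, `counts`, and `ranked` are introduced only for this illustration, and filling missing counts with zero is an extra step assumed here; the original cell does not do it.

```python
import pandas as pd

# Three topics and three responses copied from the Out[1] preview.
sample = pd.DataFrame({
    'Introduction to R': ['Only slightly interested',
                          'Only slightly interested',
                          'Not interested'],
    'Introduction to Python': ['Very interested',
                               'Very interested',
                               'Not interested'],
    'Version Control with git': ['Very interested',
                                 'Not interested',
                                 'Only slightly interested'],
})

levels = ['Very interested', 'Interested',
          'Only slightly interested', 'Not interested']

# Count each answer per topic, then put the rows in a fixed order.
counts = sample.apply(pd.Series.value_counts).reindex(levels)

# Answers nobody gave appear as NaN; converting them to 0 is a choice made
# for this sketch so the table holds plain integers.
counts = counts.fillna(0).astype(int)

# One row per topic, fewest "Very interested" answers first,
# which is the left-to-right bar order used in the plot.
ranked = counts.T.sort_values('Very interested')
print(ranked)
```

Calling `ranked.plot(kind='bar', stacked=True)` on this small table would draw the same kind of stacked bar chart as the cell above.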

In [3]:
for row in df['Suggest other topics for Workshops, Tutorials or Seminars'].dropna():
    print(row)
    print('\n')


Spatial Statistics, Game theory/decision making, Big data Applications in Economics, Business, GIS and etc, Training on presentation and writing skills on data science.


Hi! This is such a great idea!!! Is there a listserv I can join with updates and event announcements? I'm in the astronomy/physics dept. My UCD email is <retracted>, I would love to join if possible!

Ali


Introduction to R:
Titanic: Machine Learning from Disaster
<https://www.kaggle.com/c/titanic-gettingStarted>
<https://www.kaggle.com/c/titanic-gettingStarted/details/new-getting-started-with-r>


data scraping from the web.


Data mining


In the health fields, we could really use better support, and instruction, for multilevel modeling (i.e. mixed effects models).  We are starting to get a lot of repeated measures (time series) data about people's health, but few health researchers know how to analyze it.  And some of us have found that the statistical consulting available to us here at UCD is not real strong in this area.


Spatial Data Analysis, Spatial Modeling


I am an incoming graduate student of the Hydrologic Sciences Graduate Group of LAWR. I have little background in computing (some Matlab and ArcGIS model builder experience), and will utilize a lot of remote sensing images for my research. So, anything relating to processing these images (e.g. automation via Python, machine learning, speeding up the process) would


SQL, the basic of Spark and Hive


Discussion Topics

  • Do we keep meeting?
  • What should our purpose and goals be? Different than proposed?
  • How often to meet?
  • When to meet?
  • Assign tutorial topics for upcoming quarters.