Glossary

  • Big Data vs Lots of Data (from Forbes)
  • Data Mining – also known as Data Discovery or Knowledge Discovery in Databases (KDD), is the process of discovering and reporting patterns and trends in datasets using statistical and machine learning (ML) approaches such as anomaly detection, clustering, classification, and regression.
  • Predictive Analytics – extends Data Mining techniques with predictive modeling, which forecasts future patterns and trends, and with deployment of the resulting models. It is a cornerstone process in bioinformatics: get data (e.g., peptides binding a specific MHC allele); explore them with statistical approaches to decide whether they are good enough to use; divide the dataset into training and test sets; apply various regression and ML techniques to build the best predictive model; if the model's performance is satisfactory (what counts as satisfactory depends on the field), release it as executable code or a web tool so that others can use it to predict binding for their peptides of interest.
  • Geospatial Data – well explained in this BigData-Startups post 
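
The Predictive Analytics steps listed above (get data, split it into training and test sets, build a model, check its performance, decide whether to release) can be sketched in a few lines of plain Python. This is only a toy illustration under my own assumptions: the data are synthetic numbers generated on the spot (not peptide-binding measurements), the model is a simple least-squares line, and the function names and the 0.9 release threshold are hypothetical choices, not part of any real tool.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(rows):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(rows, slope, intercept):
    """Coefficient of determination of the fitted line on the given rows."""
    mean_y = sum(y for _, y in rows) / len(rows)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in rows)
    ss_tot = sum((y - mean_y) ** 2 for _, y in rows)
    return 1.0 - ss_res / ss_tot


# 1. Get data: synthetic (x, y) pairs, made up purely for illustration.
noise = random.Random(42)
data = [(float(x), 2.0 * x + 1.0 + noise.gauss(0, 0.5)) for x in range(40)]

# 2. Divide the dataset into training and test sets.
train, test = train_test_split(data)

# 3. Build the model on the training set only.
slope, intercept = fit_line(train)

# 4. Judge performance on the held-out test set.
score = r_squared(test, slope, intercept)

# 5. Release only if the score clears a field-specific bar (hypothetical value).
RELEASE_THRESHOLD = 0.9
ready_to_release = score >= RELEASE_THRESHOLD
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={score:.3f} release={ready_to_release}")
```

The important design point is that the score is computed on the test set, which the model never saw during fitting; evaluating on the training data would give an overly optimistic picture of how well the model predicts new cases.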

Disclaimer: I do my best to provide the links I found most descriptive, and I give a very brief definition in my own words that reflects my understanding of the topics, subjects, and terms and the relations among them.
