`## Machine Learning` |
`### Machine Learning: The Basics` |
`Topics to review so you don't get weeded out.` |
`* Supervised learning` |
`* Unsupervised learning` |
`* Semi-supervised learning` |
`* Modeling business decisions usually uses supervised and unsupervised learning.` |
`* Classification and regression are the most commonly seen machine learning models.` |
`### Machine Learning: The Full Topics List` |
`A longer, fuller list of topics:` |
`* Regression` |
` * **Modeling the relationship between variables, iteratively refined using an error measure.**` |
` * Linear Regression` |
` * Logistic Regression` |
` * OLS (Ordinary Least Squares) Regression` |
` * Stepwise Regression` |
` * MARS (Multivariate Adaptive Regression Splines)` |
` * LOESS (Locally Estimated Scatterplot Smoothing)` |
`* Instance Based` |
` * **Build up database of data, compare new data to database; winner-take-all or memory-based learning.**` |
` * k-Nearest Neighbor` |
` * Learning Vector Quantization` |
` * Self-Organizing Map` |
` * Locally Weighted Learning` |
`* Regularization` |
` * **Extension made to other methods, penalizes model complexity, favors simpler and more generalizable models.**` |
` * Ridge Regression` |
` * LASSO (Least Absolute Shrinkage and Selection Operator)` |
` * Elastic Net` |
` * LARS (Least Angle Regression)` |
`* Decision Tree` |
` * **Construct a model of decisions made on actual values of attributes in the data.**` |
` * Classification and Regression Tree` |
` * CHAID (Chi-Squared Automatic Interaction Detection)` |
` * Conditional Decision Trees` |
`* Bayesian` |
` * **Methods explicitly applying Bayes' Theorem for classification and regression problems.**` |
` * Naive Bayes` |
` * Gaussian Naive Bayes` |
` * Multinomial Naive Bayes` |
` * Bayesian Network` |
` * BBN (Bayesian Belief Network)` |
`* Clustering` |
` * **Centroid-based and hierarchical modeling approaches; groups of maximum commonality.**` |
` * k-Means` |
` * k-Medians` |
` * Expectation Maximization` |
` * Hierarchical Clustering` |
`* Association Rule Algorithms ` |
` * **Extract rules that best explain relationships between variables in data.**` |
` * Apriori algorithm` |
` * Eclat algorithm` |
`* Neural Networks` |
` * **Inspired by structure and function of biological neural networks, used for regression and classification problems.**` |
` * Radial Basis Function Network (RBFN)` |
` * Perceptron` |
` * Back-Propagation` |
` * Hopfield Network` |
`* Deep Learning` |
` * **Neural networks that exploit cheap and abundant computational power; semi-supervised, lots of data.**` |
` * Convolutional Neural Network (CNN)` |
` * Recurrent Neural Network (RNN)` |
` * Long Short-Term Memory Network (LSTM)` |
` * Deep Boltzmann Machine (DBM)` |
` * Deep Belief Network (DBN)` |
` * Stacked Auto-Encoders` |
`* Dimensionality Reduction` |
` * **Find inherent structure in data, in an unsupervised manner, to describe data using less information.**` |
` * PCA` |
` * t-SNE` |
` * PLS (Partial Least Squares Regression)` |
` * Sammon Mapping` |
` * Multidimensional Scaling` |
` * Projection Pursuit` |
` * Principal Component Regression` |
` * Partial Least Squares Discriminant Analysis` |
` * Mixture Discriminant Analysis` |
` * Quadratic Discriminant Analysis` |
` * Regularized Discriminant Analysis` |
` * Linear Discriminant Analysis` |
`* Ensemble` |
` * **Models composed of multiple weaker models, independently trained, that provide a combined prediction.**` |
` * Random Forest` |
` * Gradient Boosting Machines (GBM)` |
` * Boosting` |
` * Bootstrapped Aggregation (Bagging)` |
` * AdaBoost` |
` * Stacked Generalization (Blending)` |
` * Gradient Boosted Regression Trees` |
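The "Instance Based" description above (build up a database of data, compare new data to the database) can be sketched with k-Nearest Neighbor. This is a minimal illustration, not a tuned implementation; the function name `knn_predict`, the toy `database`, and the choice of Euclidean distance with majority voting are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(database, query, k=3):
    """Classify `query` by majority vote among its k nearest stored points.

    `database` is a list of (point, label) pairs; points are tuples of floats.
    """
    # Rank every stored example by Euclidean distance to the query point.
    neighbors = sorted(database, key=lambda item: math.dist(item[0], query))[:k]
    # Winner-take-all: the most common label among the k neighbors wins.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups labeled "a" and "b".
database = [
    ((1.0, 1.0), "a"),
    ((1.2, 0.8), "a"),
    ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"),
    ((5.2, 4.9), "b"),
    ((4.8, 5.1), "b"),
]

near_a = knn_predict(database, (1.1, 1.0))
near_b = knn_predict(database, (5.1, 5.0))
```

There is no training step here: the "model" is the stored data itself, which is why these methods are also called memory-based.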
`## Software Engineering ` |
`### Software Engineering: The Basics` |
`Topics to review so you don't get weeded out.` |
`[Five essential screening questions](https://sites.google.com/site/steveyegge2/five-essential-phone-screen-questions):` |
`* Coding - writing simple code with correct syntax (C, C++, Java).` |
`* Object Oriented Design - basic concepts, class models, patterns.` |
`* Scripting and Regular Expressions - know your Unix tooling.` |
`* Data Structures - demonstrate basic knowledge of common data structures.` |
`* Bits and Bytes - know about bits, bytes, and binary numbers. ` |
`Things you absolutely, positively **must** know:` |
`* Algorithm complexity` |
`* Sorting - know how to sort, know at least 2 O(n log n) sort methods (merge sort and quicksort)` |
`* Hashtables - the most useful data structure known to humankind.` |
`* Trees - this is basic stuff, BFS/DFS, so learn it.` |
`* Graphs - twice as important as you think they are.` |
`* Other Data Structures - fill up your brain with other data structures.` |
`* Math - discrete math, combinatorics, probability.` |
`* Systems - operating system level, concurrency, threads, processing, memory.` |
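As one illustration of the sorting item above, here is a minimal merge sort sketch, one of the two O(n log n) methods the list names. The function name `merge_sort` and the choice to return a new list rather than sort in place are assumptions for the example.

```python
def merge_sort(items):
    """Return a new sorted list using top-down merge sort, O(n log n)."""
    # Base case: a list of zero or one element is already sorted.
    if len(items) <= 1:
        return list(items)

    # Divide: sort each half recursively.
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Conquer: merge the two sorted halves.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Using <= keeps equal elements in their original order (stable).
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One half is exhausted; append whatever remains of the other.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


result = merge_sort([5, 2, 9, 1, 5, 6])
```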
`### Software Engineering: The Full Topics List` |
`A much longer and fuller list of topics:` |
`* Algorithm complexity` |
`* Data structures` |
` * Arrays` |
` * Linked lists` |
` * Stacks` |
` * Queues` |
` * Hash tables` |
` * Trees` |
` * Binary search trees` |
` * Heap trees` |
` * Priority queues` |
` * Balanced search trees` |
` * Tree traversal: preorder, inorder, postorder, BFS, DFS` |
` * Graphs` |
` * Directed` |
` * Undirected` |
` * Adjacency matrix` |
` * Adjacency list` |
` * BFS, DFS` |
` * Built-In Data Structures` |
` * Java Collections` |
` * C++ Standard Library` |
` * Sets` |
` * Disjoint Sets` |
` * Union Find` |
` * Advanced Tree Structures` |
` * Red-Black Trees` |
` * Splay Trees` |
` * AVL Trees` |
` * k-D Trees` |
` * Van Emde Boas Trees` |
` * N-ary, K-ary, M-ary Trees` |
` * Balanced Search Trees` |
` * 2-3 Trees, 2-4 Trees` |
` * Augmented Data Structures` |
`* Algorithms` |
` * NP, NP-Complete, Approximation Algorithms` |
` * Searching` |
` * Sequential search` |
` * Binary search` |
` * Sorting` |
` * Selection` |
` * Insertion` |
` * Heapsort` |
` * Quicksort ` |
` * Merge sort` |
` * String algorithms` |
` * String search methods` |
` * String manipulation methods` |
` * Recursion` |
` * Dynamic programming` |
` * Computational Geometry` |
` * Convex Hull` |
`* Object Oriented Programming` |
` * Design patterns` |
`* Bits and Bytes` |
`* Mathematics` |
` * Combinatorics` |
` * Probability` |
` * Linear Algebra` |
` * FFT` |
` * Bloom Filter` |
` * HyperLogLog` |
`* Systems Level Programming` |
` * Processing and threads` |
` * Caching` |
` * Memory` |
` * System routines` |
` * Messaging Systems` |
` * Serialization` |
` * Queue Systems` |
`* Scaling` |
` * Parallel Programming` |
` * Systems Design` |
` * Scalability` |
` * Data Handling` |
`* Crypto and Security` |
` * Information Theory` |
` * Parity and Hamming Code` |
` * Entropy` |
` * Hash Attacks` |
`* Unix` |
` * Kernel Basics` |
` * Command Line Tools` |
` * Emacs/Vim` |
`* Supplemental topics` |
` * Unicode` |
` * Garbage Collection` |
` * Networking` |
` * Compilers` |
` * Compression` |
` * Endianness` |
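To make one entry from the list above concrete, here is a minimal sketch of the Disjoint Sets / Union Find structure. The class name `UnionFind`, the integer element IDs, and the use of path halving with union by size are assumptions chosen for the example; other variants (union by rank, full path compression) work equally well.

```python
class UnionFind:
    """Disjoint-set forest over the integers 0..n-1."""

    def __init__(self, n):
        # Every element starts as the root of its own one-element set.
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        """Return the representative (root) of the set containing x."""
        while self.parent[x] != x:
            # Path halving: point x at its grandparent as we walk up.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Merge the sets containing a and b; return False if already merged."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a == root_b:
            return False
        # Union by size: attach the smaller tree under the larger one.
        if self.size[root_a] < self.size[root_b]:
            root_a, root_b = root_b, root_a
        self.parent[root_b] = root_a
        self.size[root_a] += self.size[root_b]
        return True


uf = UnionFind(5)
uf.union(0, 1)
uf.union(3, 4)
```

A typical use is checking connectivity in an undirected graph: two vertices are connected exactly when `find` returns the same root for both.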
`## TODO Machine Learning` |
`Have one repo, with a GitHub Pages site.` |
`HTML landing page with info about each topic.` |
`Notebook for each overarching topic,` |
`split into multiple notebooks as needed.` |
`For example, a notebook to compare ridge and lasso.` |
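As a rough idea of what a ridge versus lasso comparison could start from, here is a sketch restricted to a single feature with no intercept, where both estimators have a closed form. The function names, the toy data, and the objective scaling (ridge minimizes ½Σ(y − wx)² + ½λw², lasso minimizes ½Σ(y − wx)² + λ|w|) are assumptions for the example; a real notebook would more likely use a library such as scikit-learn.

```python
def soft_threshold(value, threshold):
    """Shrink `value` toward zero by `threshold`, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_ridge_1d(xs, ys, lam):
    """Ridge weight for y ~ w*x: w = Sxy / (Sxx + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def fit_lasso_1d(xs, ys, lam):
    """Lasso weight for y ~ w*x: w = soft_threshold(Sxy, lam) / Sxx."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return soft_threshold(sxy, lam) / sxx


# Hypothetical toy data lying exactly on y = 2x (Sxy = 28, Sxx = 14).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

for lam in (0.0, 14.0, 28.0):
    print(lam, fit_ridge_1d(xs, ys, lam), fit_lasso_1d(xs, ys, lam))
```

In this setup ridge shrinks the weight smoothly but never reaches zero, while lasso sets it to exactly zero once λ is at least |Sxy|. That difference is the usual reason lasso is described as performing selection.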
`- [ ] Regression: linear regression` |
`- [ ] Regression: logistic regression` |
`- [ ] Regression: OLS regression` |
`- [ ] Regression: Stepwise regression` |
`- [ ] Regression: MARS` |
`- [ ] Regression: LOESS` |
`- [ ] Instance: k-Nearest Neighbor` |
`- [ ] Instance: Learning Vector Quantization` |
`- [ ] Instance: Self-Organizing Map` |
`- [ ] Instance: Locally Weighted Learning` |
`- [ ] Regularization: Ridge regression` |
`- [ ] Regularization: LASSO` |
`- [ ] Regularization: Elastic net` |
`- [ ] Regularization: LARS` |
`- [ ] Decision Tree: classification tree` |
`- [ ] Decision Tree: CHAID` |
`- [ ] Decision Tree: conditional decision trees` |
`- [ ] Bayesian: LARS` |
`- [ ] Bayesian: Naive Bayes` |
`- [ ] Bayesian: Gaussian Naive Bayes` |
`- [ ] Bayesian: Multinomial Naive Bayes` |
`- [ ] Bayesian: Bayesian Network` |
`- [ ] Bayesian: Bayesian Belief Network` |
`- [ ] Clustering: k-Means` |
`- [ ] Clustering: k-Medians` |
`- [ ] Clustering: Expectation Maximization` |
`- [ ] Clustering: Hierarchical Clustering` |
`- [ ] Dimensionality Reduction: PCA` |
`- [ ] Dimensionality Reduction: t-SNE` |
`- [ ] Dimensionality Reduction: PLS` |
`- [ ] Dimensionality Reduction: Multidimensional Scaling` |
`- [ ] Dimensionality Reduction: Principal Component Regression` |
`- [ ] Dimensionality Reduction: Discriminant Analyses` |
`- [ ] Association Rule: Apriori algorithm` |
`- [ ] Deep Learning: CNN` |
`- [ ] Deep Learning: RNN` |
`- [ ] Deep Learning: LSTM` |
`- [ ] Deep Learning: DBM` |
`- [ ] Deep Learning: DBN` |
`- [ ] Deep Learning: Stacked Auto-Encoders` |
`## TODO Software Engineering ` |
`- [X] Algorithm complexity and big-O notation` |
` - [https://charlesreid1.com/wiki/Algorithm_complexity](https://charlesreid1.com/wiki/Algorithm_complexity)` |
` - 05/26` |
` - 05/27` |
`- [ ] Arrays` |
`- [ ] Linked lists` |
`- [ ] Stacks` |
`- [ ] Queues` |
`- [ ] Hash tables` |
`- [ ] Trees: binary search trees` |
`- [ ] Trees: heap trees` |
`- [ ] Trees: priority queues` |
`- [ ] Trees: balanced search trees` |
`- [ ] Trees: red-black trees` |
`- [ ] Trees: tree traversal` |
`- [ ] Graphs: directed and undirected` |
`- [ ] Graphs: graph <--> adjacency matrix/list ` |
`- [ ] Graphs: BFS, DFS` |
`- [ ] Algorithms: NP, NP-Complete, Approximation` |
`- [ ] Search Algorithms: Sequential search` |
`- [ ] Search Algorithms: Binary search` |
`- [ ] Algorithms: Selection sort` |
`- [ ] Algorithms: Merge sort ` |
`- [ ] Algorithms: Quick sort ` |
`- [ ] Algorithms: Heap sort ` |
`- [ ] Algorithms: String search methods` |
`- [ ] Algorithms: Recursion` |
`- [ ] Algorithms: Dynamic programming` |
`- [ ] Algorithms: Convex hull` |
`- [ ] Algorithms: Computational geometry` |
`- [ ] Object oriented design: basics` |
`- [ ] Object oriented design: inheritance diagrams` |
`- [ ] Object oriented design: polymorphism` |
`- [ ] Object oriented design: design patterns` |
`- [ ] Bits and Bytes` |
`- [ ] Mathematics: Combinatorics` |
`- [ ] Mathematics: Probability` |
`- [ ] Mathematics: Linear Algebra (computational)` |
`- [ ] Mathematics: FFT` |
`- [ ] Systems: Processing and Threads` |
`- [ ] Systems: Caching` |
`- [ ] Systems: Memory` |
`- [ ] Systems: System routines` |
`- [ ] Systems: Messaging systems` |
`- [ ] Systems: Serialization ` |
`- [ ] Systems: Queue systems` |
`- [ ] Scaling: Systems design` |
`- [ ] Scaling: Scalability` |
`- [ ] Scaling: Data handling` |
`- [ ] Parallel programming: Basic concepts/algorithms` |
`- [ ] Crypto Security: Information Theory` |
`- [ ] Crypto Security: Parity and Hamming Code` |
`- [ ] Crypto Security: Entropy` |
`- [ ] Crypto Security: Birthday/Hash Attacks` |
`- [ ] Crypto Security: Public Key Cryptography Math` |
`- [ ] Supplemental: Unicode` |
`- [ ] Supplemental: Garbage Collection` |
`- [ ] Supplemental: Networking` |
`- [ ] Supplemental: Compilers` |
`- [ ] Supplemental: Compression` |
`- [ ] Supplemental: Endianness` |

`## The Plan` |
`### Tracks` |
`We are following two tracks:` |
`* Software Engineering Track` |
`* Machine Learning Track` |
`Software engineering track:` |
`* Paper and pencil working out algorithms (see binder)` |
`* Wiki: distilled, polished notes: see [https://charlesreid1.com/wiki/CS](https://charlesreid1.com/wiki/CS)` |
`* Git: to-do list for topics: see [software engineering to do list (below)](#software-engineering-to-do-list)` |
`* Git: code practice: see [https://git.charlesreid1.com/cs/java](https://git.charlesreid1.com/cs/java)` |
`* Flashcards` |
`Machine learning track:` |
`* Paper and pencil notes (rough), problems (working out), thinking` |
` * Note: following Alpaydin book, working through problems` |
`* Wiki: distilled, polished notes and learnings` |
` * Summary of major concepts` |
` * Answers/examples worked out more clearly` |
` * Fast notes, for studying, not presentation, so snap photos and upload` |
`* Git: to-do list for topics` |
`* Git: code practice` |
`* Flashcards` |
`### Daily Plan` |
`Each day:` |
`- Pick one subject from the list.` |
`- Watch videos on the topic.` |
`- Implement the concept in Java or Python.` |
`- Optionally, implement in C (and/or in C++, with or without the stdlib).` |
`- Write tests to ensure code is correct.` |
`- Create flashcards` |
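A single day's "implement the concept, then write tests" steps might look like the sketch below, using binary search from the topic list as the concept. The function and test names are hypothetical, and plain `assert` statements stand in for whatever test framework is preferred.

```python
def binary_search(sorted_items, target):
    """Return the index of `target` in `sorted_items`, or -1 if absent."""
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            lo = mid + 1  # target can only be in the right half
        else:
            hi = mid - 1  # target can only be in the left half
    return -1


def test_binary_search():
    data = [1, 3, 5, 7, 9, 11]
    # Every present element is found at its own index.
    for index, value in enumerate(data):
        assert binary_search(data, value) == index
    # Absent values, including ones outside the range, report -1.
    assert binary_search(data, 4) == -1
    assert binary_search(data, 0) == -1
    assert binary_search(data, 12) == -1
    # The empty list is a common edge case worth covering.
    assert binary_search([], 1) == -1


test_binary_search()
```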
`After one week:` |
`- Revisit and review` |
`Long term strategy:` |
`- Practice coding until you are sick of it.` |
`- Add flashcards` |
`- Work within limited constraints (think interviews).` |
`- Know the built-in types.` |
`Code:` |
`- [Java](https://git.charlesreid1.com/cs/java)` |
`- [Python](https://git.charlesreid1.com/cs/python)` |
`- [C](https://git.charlesreid1.com/cs/c)` |
`- [C++](https://git.charlesreid1.com/cs/cpp)` |
`Practice writing out on a whiteboard and/or on paper, ` |
`before implementing on computer.` |
`Get a big drawing pad from the art store.` |
`See [checklist](#checklist) below for the checklist of completed tasks.` |

