Your data can tell you a lot about the type of ML you could do

"Text Data (build chatbots for customer service, …):

  • Lots of data: can make use of DL. A robust data engineering infrastructure should be designed around the data. Modeling requires people who are highly specialized in the domain. GPU machines should increase modeling speed.

  • Little amount of data: Pre-trained DL models are probably the first thing to explore. Engineering infrastructure around the data is less important. Modeling could be performed by a less specialized workforce. GPU machines may not be necessary.

Image Data (face recognition for security systems, augmented reality systems, ...):

  • Lots of data: DL will probably generate performance beyond anything traditional techniques could produce. A robust data engineering infrastructure around the data should be designed. Modeling requires highly specialized people. GPU machines are a must.

  • Little amount of data: Pre-trained DL models could yield satisfactory results, but it may be worth questioning whether to invest in ML at all for computer vision applications. Engineering infrastructure around the data is less important. Modeling could be performed by a less specialized workforce. GPU machines remain important for improved modeling speed.

Time Series Data (sales forecast, stock price prediction, ...):

  • Lots of data: Traditional methods like XGBoost will generally yield greater performance on time series data. A robust data engineering infrastructure around the data should be designed. Modeling could be performed by generalist data scientists. Both GPU machines (for Transformers or LSTMs, for example) and CPU machines could be leveraged.

  • Little amount of data: It is potentially not a problem to be solved with ML techniques. Engineering infrastructure around the data is less important. Modeling could be performed by a less specialized workforce. GPU machines are most likely unnecessary.
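Tree-based models such as XGBoost expect a table of features rather than a raw sequence, so a forecasting series is usually reframed as rows of lagged values before modeling. The following is a minimal sketch of that reframing using only the standard library. The function name `make_lag_features` and the sales figures are hypothetical and chosen purely for illustration.

```python
from typing import List, Tuple


def make_lag_features(
    series: List[float], n_lags: int
) -> Tuple[List[List[float]], List[float]]:
    """Turn a series into (features, targets) where each row holds the
    n_lags values that precede the target value."""
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    if len(series) <= n_lags:
        raise ValueError("series must be longer than n_lags")

    features: List[List[float]] = []
    targets: List[float] = []
    for t in range(n_lags, len(series)):
        features.append(series[t - n_lags:t])  # the previous n_lags values
        targets.append(series[t])              # the value to predict
    return features, targets


# Made-up monthly sales figures, for illustration only.
sales = [10.0, 12.0, 13.0, 15.0, 18.0]
X, y = make_lag_features(sales, n_lags=2)
```

The resulting `X` and `y` can then be handed to any tabular regressor. Keeping the rows in time order also matters when splitting into training and validation sets, so that the model is never validated on data older than what it was trained on.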

Tabular Data (product recommendation, customer churn prediction, ...):

  • Lots of data: Traditional ML techniques usually outperform Deep Learning. However, in the case of product recommendation with very sparse variables, DL has proven to bring superior performance. A robust data engineering infrastructure around the data should be designed. Modeling could be performed by generalist data scientists. GPU machines may not be very useful, as Deep Learning is less relevant in this case (apart from recommendation engines).

  • Little amount of data: It is probably not a problem to be solved with ML, and investing in advanced analytics should possibly be reconsidered. Engineering infrastructure around the data is less important. Modeling could be performed by a less specialized workforce. GPU machines are most likely unnecessary."
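The guidance above amounts to a lookup keyed on data type and data volume. One way to sketch it is as a small table in code. The names `Recommendation`, `GUIDE`, and `recommend` are hypothetical, the values only paraphrase the bullets, and the text does not say where "lots of data" begins, so that judgment is left to the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    approach: str
    robust_data_infra: bool  # is a robust data engineering setup advised?
    staffing: str
    gpu: str


# Keys are (data_type, lots_of_data); values paraphrase the bullets above.
GUIDE = {
    ("text", True): Recommendation(
        "deep learning", True, "highly specialized", "useful for speed"),
    ("text", False): Recommendation(
        "pre-trained DL models", False, "less specialized",
        "may not be necessary"),
    ("image", True): Recommendation(
        "deep learning", True, "highly specialized", "a must"),
    ("image", False): Recommendation(
        "pre-trained DL models, if ML is worth it at all", False,
        "less specialized", "important for speed"),
    ("time_series", True): Recommendation(
        "traditional ML such as XGBoost", True, "generalist",
        "GPU and CPU both usable"),
    ("time_series", False): Recommendation(
        "potentially not an ML problem", False, "less specialized",
        "most likely unnecessary"),
    ("tabular", True): Recommendation(
        "traditional ML (DL for sparse recommendation)", True, "generalist",
        "of limited use outside recommendation engines"),
    ("tabular", False): Recommendation(
        "probably not an ML problem", False, "less specialized",
        "most likely unnecessary"),
}


def recommend(data_type: str, lots_of_data: bool) -> Recommendation:
    """Look up the guidance for a data type and a caller-judged volume."""
    key = (data_type.strip().lower().replace(" ", "_"), lots_of_data)
    if key not in GUIDE:
        raise ValueError(f"no guidance for data type {data_type!r}")
    return GUIDE[key]


print(recommend("Time Series", True).approach)
```

Writing the matrix out this way makes the pattern easy to see: robust infrastructure is advised whenever data is plentiful, while specialized staff and GPUs matter mostly for text and image work.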

Author

Ai Base Network (ABN), ABN ASIA was founded by people with deep roots in academia, with work experience in the US, Holland, Hungary, Japan, South Korea, Singapore, and Vietnam. ABN Asia is where academia and technology meet opportunity. With our cutting-edge solutions and competent software development services, we're helping businesses level up and take on the global scene. Our commitment: Faster. Better. More reliable. In most cases: Cheaper as well.
