15:30 - 16:15  |  AI & Data Science

Feed Hungry AI Systems More Data, Faster & Efficiently

Thursday 24 September 2020


ML models perform better with more data, and the more iterations of training, tuning and validation you can run, the better your results. So what's the big deal?

It is often about speed, e.g. making Spark jobs run faster for operational efficiency. The faster a job runs, the more jobs can be run on the same cluster, maximising the return on time and infrastructure investment, and shortening training and inference cycles in turn.
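The arithmetic behind that claim is simple enough to sketch in plain Python (no Spark needed; the runtimes below are illustrative assumptions, not benchmarks):

```python
# Illustrative only: how a faster job translates into cluster throughput.
# The 4-hour and 1-hour runtimes are assumed numbers for the example.

def jobs_per_day(job_runtime_hours: float, cluster_hours_per_day: float = 24.0) -> float:
    """How many sequential runs of a job fit into one day of cluster time."""
    return cluster_hours_per_day / job_runtime_hours

baseline = jobs_per_day(4.0)     # a 4-hour Spark job
accelerated = jobs_per_day(1.0)  # the same job after a 4x speed-up

print(f"baseline: {baseline:.0f} runs/day, accelerated: {accelerated:.0f} runs/day")
```

A 4x speed-up means 4x more training, tuning and validation iterations on the same hardware, which is where the return on investment comes from.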

Data prep is painful. The problem is having an online system with streaming data and the need to make an inference in milliseconds. The problem is wanting to pull disparate signal data in real time from sources across different countries and data centres.

AI/ML systems have insatiable appetites for data.

In this session, we’ll cover how to:

  • Ingest high-velocity, high-volume data at the edge and with “multi-site clustering”
  • Reduce training and inference times
  • Accelerate Spark jobs with massive parallelisation
  • Create a high throughput and low latency streaming pipeline
  • Use fewer resources overall
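To give a flavour of the parallelisation point above, here is a minimal stand-in sketch in plain Python: splitting data into partitions and processing them concurrently, which is the same basic pattern Spark applies across executor cores. The partition count and the per-partition work function are assumptions for illustration, not anything from the session itself:

```python
from concurrent.futures import ProcessPoolExecutor

def process_partition(partition):
    # Stand-in for per-partition work (e.g. feature extraction or aggregation).
    return sum(x * x for x in partition)

def run_parallel(data, n_partitions=4):
    # Split the input into roughly equal partitions and process them
    # concurrently, then combine the partial results.
    size = (len(data) + n_partitions - 1) // n_partitions
    partitions = [data[i:i + size] for i in range(0, len(data), size)]
    with ProcessPoolExecutor() as pool:
        return sum(pool.map(process_partition, partitions))

if __name__ == "__main__":
    print(run_parallel(list(range(1000))))
```

More partitions than cores buys nothing on a single machine; on a cluster, Spark spreads the same partitioned work over many executors, which is where the massive parallelisation in the session title comes in.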
