Sponsor Presentations: Avoid the top 5 web data pitfalls when developing AI models (Sponsor: Bright Data)

Thursday - May 16th, 2024 3:30 p.m.-4:30 p.m. in Room 308

Presented by:


Data Bias: Biased training data can lead to AI models that are unfair or discriminatory. For example, if a dataset for facial recognition software predominantly contains images of people from certain ethnic groups, the model may perform poorly on faces from underrepresented groups.

Insufficient Data Variety: AI models require diverse data to handle different scenarios and variations. If the training data is too homogeneous, the model may not perform well in diverse real-world conditions.

Overfitting and Underfitting: Overfitting occurs when a model is too complex and fits the training data so closely that it fails to generalize to new data. Underfitting happens when the model is too simple to capture the underlying patterns in the data.

Poor Data Quality: If the training data is full of errors, inconsistencies, or poor labels, the AI model will likely inherit these flaws. Ensuring high data quality is essential for developing reliable and accurate AI models.

Ignoring Data Drift: Over time, the real-world data an AI model encounters may change, or "drift," from the data on which it was trained, due to evolving trends, behaviors, or environments. Failing to monitor and adapt to these changes can render a model less effective or even obsolete.
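As a minimal illustration of the last pitfall, drift monitoring can start with something as simple as comparing summary statistics of incoming data against the training distribution. The sketch below (not from the talk; the function name, threshold, and synthetic data are illustrative assumptions) flags a feature as drifted when its live mean moves by more than one training standard deviation:

```python
import random
import statistics

def drift_score(train_values, live_values):
    """Return how far the live mean has moved from the training mean,
    measured in training standard deviations."""
    mu = statistics.mean(train_values)
    sigma = statistics.stdev(train_values)
    live_mu = statistics.mean(live_values)
    return abs(live_mu - mu) / sigma

# Synthetic example: one stable feature and one that has drifted.
random.seed(0)
train = [random.gauss(0.0, 1.0) for _ in range(1000)]
stable = [random.gauss(0.0, 1.0) for _ in range(1000)]
drifted = [random.gauss(1.5, 1.0) for _ in range(1000)]

print(drift_score(train, stable))   # small: no drift flagged
print(drift_score(train, drifted))  # large: mean has shifted
```

In practice a production system would use a proper statistical test (e.g. a two-sample Kolmogorov-Smirnov test) per feature and re-train or alert when drift is detected, but the mean-shift check above captures the core idea.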