Hi everyone!

 

I'm so excited to kick off our first Data People Community competition so we can learn and have some fun together 🎉

 

Now...ready for the first challenge?

 

Today's challenge is all about discussing the immense opportunities, benefits, and strengths that Generative AI and LLMs bring to our work. We want to hear your valuable insights!

 

What are your thoughts on the most significant opportunities, benefits, and strengths that Generative AI and LLMs bring to your day-to-day or long-term work?

 

Share your opinion in the comments below for a chance to win a $50 Amazon voucher 🎁

 

We'll announce today's winner tomorrow, when challenge number 2 will also go live. One winner will be chosen at random. The competition ends on Monday, June 5 at 11 pm PT, and the winner will be announced the following day. You can find the T&Cs for today’s challenge attached.

 

We're excited to see your contributions!

 

Happy posting ✍️

Here are three major opportunities:

  1. Better data accuracy and completeness
  2. Improved data speed and efficiency
  3. Enhanced data insights and intelligence

 

In my day-to-day work, I can see how Generative AI and LLMs could be used to improve the quality of the data that I work with. For example, I could use these technologies to identify and correct errors in data, as well as to fill in missing data. This would allow me to draw better conclusions and increase the value of the data I work with.
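As a rough illustration of the kind of thing I mean (the call_llm function and the record fields are purely hypothetical, just a sketch of the pattern, not a real API):

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around whatever LLM API you use."""
    raise NotImplementedError

def fill_missing_fields(record: dict) -> dict:
    """Ask the model to propose values for empty fields in a record,
    then merge them back in -- with a rule-based or human check afterwards."""
    missing = [k for k, v in record.items() if v in (None, "")]
    if not missing:
        return record
    prompt = (
        "Given this partial record, suggest plausible values for the missing "
        f"fields {missing}. Respond with JSON only.\n\n{json.dumps(record)}"
    )
    suggestions = json.loads(call_llm(prompt))
    return {**record, **{k: suggestions.get(k, record[k]) for k in missing}}
```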


Currently I'm working on a hobby project that uses ChatGPT to categorize astronomy news items, so that I can more easily find specific items when I'm preparing a presentation on a topic. What I've learned quite quickly is that the output ChatGPT produces (and probably other LLMs as well) can have a quite inconsistent format. For example, when I asked it to give me 1-6 terms to tag a news item with, it sometimes delivered comma-separated results, sometimes newline-separated ones; sometimes lines started with dashes, and sometimes the result was in all caps.

So you really need to be very clear about how you want your result formatted. It is possible to do so, but even then there may be variations. It's almost like I outsourced my request to a team of humans, each with their own way of working.

When you start using LLMs in data pipelines, you really need to be strict in the prompt you write, decide on the “temperature” of the model (how much variation you want), and check the data quality afterwards.
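To make that concrete, here's a rough sketch of what I mean, assuming the current OpenAI Python SDK; the model name and the normalize_tags helper are just placeholders for illustration:

```python
import re
from openai import OpenAI  # assumes the current OpenAI Python SDK

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "Tag the following astronomy news item with 1-6 terms.\n"
    "Return ONLY a single line of lowercase terms separated by commas, "
    "with no dashes, no numbering and no extra text.\n\n"
    "News item: {item}"
)

def normalize_tags(raw: str) -> list[str]:
    """Cope with the format drift described above: commas, newlines,
    leading dashes, ALL CAPS -- normalize everything to a flat list."""
    parts = re.split(r"[,\n]", raw)
    tags = [p.strip().lstrip("-").strip().lower() for p in parts]
    return [t for t in tags if t]

def tag_item(item: str) -> list[str]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",            # placeholder model name
        temperature=0,                  # minimize variation between runs
        messages=[{"role": "user", "content": PROMPT.format(item=item)}],
    )
    tags = normalize_tags(response.choices[0].message.content)
    if not 1 <= len(tags) <= 6:
        raise ValueError(f"Expected 1-6 tags, got {len(tags)}: {tags}")
    return tags
```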

Hi, I've had the same experience: the output can sometimes be inconsistent. What helped me was providing a few examples (showing the correct output format) as part of the prompt, plus retrying the query if the response failed to parse.
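Roughly, the pattern looks like this (the query_llm function is a stand-in for whatever client call you use, and the example items and retry count are just illustrative):

```python
def query_llm(prompt: str) -> str:
    """Stand-in for whatever LLM client call you use."""
    raise NotImplementedError

FEW_SHOT_PROMPT = (
    "Tag each news item with 1-6 lowercase terms, comma separated, on one line.\n\n"
    "Item: New exoplanet found around a red dwarf.\n"
    "Tags: exoplanet, red dwarf\n\n"
    "Item: JWST images reveal early galaxy formation.\n"
    "Tags: jwst, galaxies, early universe\n\n"
    "Item: {item}\n"
    "Tags:"
)

def parse_tags(raw: str) -> list[str]:
    tags = [t.strip() for t in raw.split(",") if t.strip()]
    if not 1 <= len(tags) <= 6 or any("\n" in t for t in tags):
        raise ValueError(f"Unparseable tag output: {raw!r}")
    return tags

def tag_with_retries(item: str, max_attempts: int = 3) -> list[str]:
    last_error = None
    for _ in range(max_attempts):
        raw = query_llm(FEW_SHOT_PROMPT.format(item=item))
        try:
            return parse_tags(raw)
        except ValueError as err:
            last_error = err  # output didn't match the format: try again
    raise RuntimeError(f"Gave up after {max_attempts} attempts") from last_error
```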

I like how you look at LLMs as if they were humans when thinking about what to expect. That's unusual when we're talking about algorithms or ML, but I think it can be beneficial in a lot of situations when dealing with LLMs.

