Modeling
Select modeling technique
Prompt engineering
As a first step, we should try prompt engineering. We can get a good baseline simply by showing an LLM the original text and the rewritten text and asking what the prompt could have been.
Providing a few examples in the prompt (few-shot prompting) could boost the score even further.
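A minimal sketch of how such a recovery prompt could be assembled; the wording of the instruction and the template field names are my own assumptions, not a fixed format:

```python
def build_recovery_prompt(original, rewritten, examples=()):
    """Assemble a prompt asking an LLM to guess the rewrite instruction.

    `examples` is an optional sequence of (original, rewritten, prompt)
    triples used as few-shot demonstrations.
    """
    parts = ["Given an original text and its rewritten version, "
             "infer the instruction that produced the rewrite.\n"]
    # Few-shot demonstrations go before the actual query.
    for ex_orig, ex_rew, ex_prompt in examples:
        parts.append(f"Original: {ex_orig}\nRewritten: {ex_rew}\n"
                     f"Instruction: {ex_prompt}\n")
    # The query itself ends with an open "Instruction:" for the model to complete.
    parts.append(f"Original: {original}\nRewritten: {rewritten}\nInstruction:")
    return "\n".join(parts)
```

Passing an empty `examples` tuple gives the zero-shot baseline; adding triples turns it into the few-shot variant.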
Model fine-tuning
Fine-tuning the model with new special tokens could teach it this new task. This has the potential to achieve even better results.
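A sketch of the training-data format such a fine-tune might use. The special-token names below are hypothetical placeholders; with the Hugging Face `transformers` library they would typically be registered via `tokenizer.add_special_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`:

```python
# Hypothetical special tokens marking each field; during fine-tuning these
# would be added to the tokenizer vocabulary so the model learns the format.
SPECIAL_TOKENS = ["<original>", "<rewritten>", "<prompt>"]

def format_training_example(original, rewritten, prompt):
    """Serialize one (original, rewritten, prompt) triple into the
    single-string format the model would be fine-tuned on."""
    return (f"<original>{original}"
            f"<rewritten>{rewritten}"
            f"<prompt>{prompt}")
```

At inference time the model would be fed everything up to `<prompt>` and asked to generate the rest.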
Model inversion
We already know which model was used to generate the rewritten text; can't we reverse the process and fill in the middle to recover the prompt?
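One way to frame this is fill-in-the-middle: the prefix (original text) and suffix (rewritten text) are known, and the model is asked to generate the missing middle (the rewrite prompt). A minimal sketch, where the sentinel token names only follow the convention of FIM-trained models and are placeholders here:

```python
def build_fim_prompt(original, rewritten,
                     prefix_tok="<fim_prefix>", suffix_tok="<fim_suffix>",
                     middle_tok="<fim_middle>"):
    """Frame prompt recovery as fill-in-the-middle: generation is
    triggered by the trailing middle token, so a FIM-trained model
    would produce the span between original and rewritten text."""
    return f"{prefix_tok}{original}{suffix_tok}{rewritten}{middle_tok}"
```

This only works if the generating model (or a close relative) was trained with a FIM objective and exposes these sentinel tokens, which would need to be verified.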
Generate experimentation design
The lack of data makes evaluation very difficult. We don't know whether our generated data follows a distribution similar to the test dataset.
The public leaderboard uses only 15% of the data, so it's just 195 samples.
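With only 195 public samples, leaderboard scores are noisy. A quick way to quantify that noise on our own validation scores is a bootstrap estimate of the standard error of the mean; this is a generic sketch, independent of the competition's actual metric:

```python
import random

def bootstrap_std_error(scores, n_resamples=2000, seed=0):
    """Estimate the standard error of the mean score on a small
    sample (e.g. 195 leaderboard rows) via bootstrap resampling."""
    rng = random.Random(seed)
    n = len(scores)
    means = []
    for _ in range(n_resamples):
        # Resample with replacement and record the resample mean.
        resample = [scores[rng.randrange(n)] for _ in range(n)]
        means.append(sum(resample) / n)
    mu = sum(means) / n_resamples
    var = sum((m - mu) ** 2 for m in means) / (n_resamples - 1)
    return var ** 0.5
```

If the standard error is comparable to the score gaps between our experiments, public-leaderboard differences at that scale should not be trusted.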
Last update:
2024-03-18