Discussion about this post

Houfu

The response rating is very thorough, but it reads like a lot of work, which would be difficult to scale without a substantial team. I'm not sure if this helps, but I would write an ideal answer to each question (one that scores well on all the factors you were looking at), then use NLP techniques such as n-gram overlap to measure how close each model's answer is to that ideal answer, and turn that closeness into an overall rating.
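
To make the suggestion concrete, here is a rough sketch of what n-gram scoring against an ideal answer could look like. It is only an illustration under my own assumptions: the function names (`ngrams`, `ngram_f1`, `overall_rating`), the simple regex tokenizer, and the choice of averaging unigram and bigram F1 are all hypothetical, not something taken from the post.

```python
import re
from collections import Counter


def ngrams(text, n):
    """Count the word n-grams in a text (lowercased, punctuation dropped)."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_f1(reference, candidate, n=2):
    """F1 overlap between the n-grams of the ideal answer and a model answer."""
    ref, cand = ngrams(reference, n), ngrams(candidate, n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def overall_rating(reference, candidate, max_n=2):
    """Average the F1 scores for n = 1..max_n (an assumed weighting)."""
    scores = [ngram_f1(reference, candidate, n) for n in range(1, max_n + 1)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical ideal answer and model answers, for illustration only.
    ideal = "Paris is the capital of France."
    answers = {
        "model_a": "The capital of France is Paris.",
        "model_b": "I think it might be Lyon.",
    }
    for name, answer in answers.items():
        print(name, round(overall_rating(ideal, answer), 3))
```

Word overlap like this only rewards answers that are phrased similarly to the ideal one, so a correct answer in different words would score low. It would probably work best as a cheap first pass alongside the manual rating.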
