Maybe the problem is that I’m too close to the specific problem. AI tooling might be better for open-ended or free-association “why not try glue on pizza” type discussions, but when you already know “send exactly 4-7-Q-unicorn emoji in this field or the transaction is converted from USD to KPW” having to coax the machine to come to that conclusion 100% of the time is harder than just doing it yourself.
I, too, work in fintech. I agree with this analysis. That said, we currently have a large mishmash of regexes doing classification and they aren’t bulletproof. It would be useful to see about using something like a fine-tuned BERT model for doing classification for transactions that passed through the regex net without getting classified. And the PoC would be would be just context stuffing some examples for a few-shot prompt of an LLM and a constrained grammar (just the classification, plz). Because our finance generalists basically have to do this same process, and it would be nice to augment their productivity with a hint: “The computer thinks it might be this kinda transaction”
This is a solvable problem. Just make a LoRA of the Alice character. For modifications to the character, you might also need to make more LoRAs, but again totally doable. Then at runtime, you are just shuffling LoRAs when you need to generate.
You’re correct that it will struggle to give you exactly what you want because you need to have some “machine sympathy.” If you think in smaller steps and get the machine to do those smaller, more do-able steps, you can eventually accomplish the overall goal. It is the difference in asking a model to write a story versus asking it to first generate characters, a scenario, plot and then using that as context to write just a small part of the story. The first story will be bland and incoherent after awhile. The second, through better context control, will weave you a pretty consistent story.
These models are not magic (even though it feels like it). That they follow instructions at all is amazing, but they simply will not get the nuance of the overall picture and be able to accomplish it un-aided. If you think of them as natural language processors capable of simple, mechanical tasks and drive them mechanistically, you’ll get much better results.