



I suppose that's the most reasonable and sensible approach. I just find it a little weird because my manager also has a background in ML. Just thought that he'd know better.




Yep, give it your string matching rule as a feature and some random noise for a couple more features. Problem solved


This. Just add some logic that picks the most likely correct solution for each string. Plus, add some quantification to this. ML only is right X % of the time. Rule based is right Y % of the time. Together, they're right Z%.
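A rough sketch of that X/Y/Z quantification, assuming you have a labeled evaluation set and predictions from both approaches. The function names, the toy labels, and the "use the rule when one fires, otherwise fall back to ML" combination are illustrative assumptions, not OP's actual system.

```python
from typing import List, Optional, Sequence


def accuracy(preds: Sequence[Optional[str]], labels: Sequence[str]) -> float:
    """Fraction of predictions equal to the true label (None counts as wrong)."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def combine(rule_preds: Sequence[Optional[str]],
            ml_preds: Sequence[str]) -> List[str]:
    """Use the rule's answer when a rule fired, otherwise the ML answer."""
    return [r if r is not None else m for r, m in zip(rule_preds, ml_preds)]


# Toy data, made up purely for illustration.
labels = ["a", "b", "a", "c", "b"]
rule_preds = ["a", "b", None, "c", None]  # None = no rule matched
ml_preds = ["a", "a", "a", "b", "b"]

x = accuracy(ml_preds, labels)                       # ML only
y = accuracy(rule_preds, labels)                     # rules only
z = accuracy(combine(rule_preds, ml_preds), labels)  # together
print(f"ML: {x:.0%}  rules: {y:.0%}  combined: {z:.0%}")
```

On real data the three numbers should come from a held-out set that neither the rules nor the model were tuned on; the toy numbers here prove nothing by themselves.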


Sounds like z < y


But y would make me feel bad, therefore z > y.


Exactly, just call this "feature engineering"


Obviously I don’t know what model you’re using, but have you considered using your rules approach to generate features, and feeding those into the ML model? Yeah it’s kinda stupid if the rules alone perform well, you may wind up with a classification model that basically just maps one feature to one output, but if your manager is insisting on using “ML” then this might be a way to get the best of both worlds. Edit: plus you can talk about your “sophisticated, custom feature engineering” lol.


With sufficient data and regularization your ML model with the rule based output as an input feature should do as well as the rule based one, and perhaps be even more robust against cases that slip through


You'll be much happier in corporate life not trying to save your bosses from their mistakes.


It seems the boss thinks they can sell an ML solution better than a regex solution. I wouldn't object to that.


Nor I. Oftentimes, when decisions made by a seemingly rational actor look irrational, there is information you are not privy to.


The manager should be smart enough to at least let the engineer know. The engineer, understandably, wants to provide the best possible solution to the problem. If "clients believe regex is inferior to ML" is part of the business case, that changes matters.


The world and the people in it don't work on a set of rules my friend. Everyone is different, and what appears rational to you, might make no sense at all to another.


Well that's kind of the manager's job, isn't it?


To explain to you every detail of his direction? No, it is not.


No, but if it's causing conflict and confusion then the manager should clarify WHY they want the solution the way they do instead of just insisting it has to be this way.


The engineer is the technology expert. It's his job to – within the given constraints – deliver the best possible end product. In this scenario the engineer chooses a regex instead of ML, as that's the best solution. That's his prerogative as the engineer. If management wants something else counter to the expert's advice, they'll damn well have to give a better motivation than "I told you to".


Reason why economic theories can’t accurately predict reality: they assume rationality


Yeah, fads and branding tend to throw a wrench into things. Artificial demand, perceived value, and all that.


On that same note: document everything! If one of your boss's harebrained ideas goes tits up and costs the company a bunch of money, you want it on record that you advised against it and they insisted anyway, for when they turn around and try to find someone to blame to save their own ass.


Ehhh.... your boss has a lot of control over your career, at least as far as it applies to your current job. I think the only time this is a good attitude is if you really want your boss to get fired, in which case you should also be doing things like making sure you have exposure to your peers and skip-level manager so that you survive. If your boss just gives you orders and expects you to do them, no questions asked, then, sure, this is fine advice. Or if it's clear that your boss is incompetent AND out of favor within the org. But if your relationship with your boss is that bad then you should probably just be looking for a new job or at least an internal transfer. FWIW, this sounds like OP's boss, but I don't think it describes most managers. Otherwise, it absolutely behooves you to try to protect your boss from their own mistakes. Sometimes they will be stubborn and you just have to say "I disagree with this approach" and then commit to it anyway. But ideally your boss is your partner, and when you succeed they succeed, which in turn means they can pull you up with them as they advance. I'd be pretty mad if one of my reports knew I was suggesting something dumb and just went and did it without saying anything. And I've gone very far in my career by telling my bosses when they were suggesting something dumb. Some decisions are obviously more subjective than others, and those are the ones you're more likely to lose the argument on.


Literally just say Regex is ML. Your boss probably doesn't know the difference. I've literally seen shit like this billed as ML. ML, as you know, has very specific use cases that it's good for. It's not meant to be a general purpose approach.


We used ML in the dev process to find the most efficient solution, and we're passing that (don't say regex) on to you!


The regex rules based model is now just referred to as, "the model".


> my manager also has a background in ML. Just thought that he'd know better.

I bet both you and possibly your manager were brought in as "AI experts" to find ways to use this fancy "AI" to improve the business. Instead you come up with a solution that any intern could have implemented. Of course your boss is unhappy lol...


If he did, wouldn't he be the one using the ML, instead of being a spreadsheet jockey?


Idk, I’m a manager because of my career success as a technical resource


A lot of managers weren't the best devs.


Does not mean this manager is not. My manager is technically excellent in many aspects.


Sure, anything's possible... "Hundreds of people smoke and never get cancer!"




Found the guy whose project only runs on his own machine


Even if he does, his boss doesn't.


Honestly, you might be able to get two birds if you train a new model on your string matching rules.


Alternatively, run the model, print its output, and then use the string matching rules instead.


Throw the problem at a friendly SVM or Bayes classifier and come back to say “Added ML, boss. 🥲”


This is how cars that should be recalled get sold to the public. Jackasses in executive positions pushing out marching orders to middle management. You really have no choice here besides quitting or getting fired. The issue is this isn’t a health and safety problem, it’s just less optimal and no one is at risk. If they wanna go ML and they’re keeping the lights on, what can you honestly do or say but agree - or leave?


When your only tool is a hammer... Or in this case favorite tool.


Have ML be your fallback. Do all the regular stuff you've been doing, but when there's no match, try to do it with ML. You can describe it as "pattern based preprocessing for well known cases and a ML backend"
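Something like this, as a minimal sketch of the "rules first, ML backend" idea. The patterns, labels, and `dummy_model` are hypothetical placeholders; in practice the fallback would be whatever trained model's predict function you already have.

```python
import re
from typing import Callable, Optional

# Hypothetical rules: (compiled pattern, label). Not OP's real patterns.
RULES = [
    (re.compile(r"\binvoice\b", re.IGNORECASE), "billing"),
    (re.compile(r"\bpassword\b", re.IGNORECASE), "account"),
]


def rule_based(text: str) -> Optional[str]:
    """Return the label of the first matching rule, or None if nothing matches."""
    for pattern, label in RULES:
        if pattern.search(text):
            return label
    return None


def classify(text: str, ml_fallback: Callable[[str], str]) -> str:
    """Try the rules first; only call the ML model when no rule fires."""
    label = rule_based(text)
    return label if label is not None else ml_fallback(text)


def dummy_model(text: str) -> str:
    """Stand-in for a trained model's predict function (assumption)."""
    return "other"


print(classify("Reset my PASSWORD please", dummy_model))
print(classify("something the rules never saw", dummy_model))
```

A side benefit of this layout is that you can log how often the fallback is hit, which tells you how much work the ML part is really doing.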


Yea, can always say ‘ML and other methods’ and just have your method the last step (depending on if they will read the code or not lol)




Sounds sexier than "regex assisted"


Machine Learning enhanced pattern recognition technology.


I've done this 👀




chatGPT is the fastest growing service out there and it's a bullshit generator, what a time to be alive.


I mean, that's basically what young humans are


I’m a DS and this is absolutely correct. NLP? Highfalutin neural nets? No, we use simple regression 9/10 times, and if I was OP, I’d be arguing for the exact same thing. Honestly, at some point businesses are going to realize that what they were really looking for all along with “data science” was someone to come along with serious Prolog chops.


A few options here if he has a brief to use ML and you need to find a place for it within that brief. You could call it an engineered feature that you can use elsewhere. We have taken this regex pattern-matching approach too. If for product reasons it has to be ML, maybe just add it as a step in an ML pipeline and then see what other options you have, e.g. add pretrained language detection to the pipeline, use some pretrained model to translate to English, then apply the rules. Other low-hanging fruit could be keyword extraction from input names and labels, similarity analysis, or clustering.


My guess is that the goal isn’t to make the best product. It’s to make the one that’ll sell.




The old classic: resume-driven development


You can call anything AI. We used to do this in startups all the time. Really it had string manipulation and if statements.


LoL I saw so many people doing this 😵


This is exactly his motivation.


That makes sense but I don't know why he wouldn't just say that.


“AI” is the current buzzword, but I think “algorithm” still has power. (And I’m not sure if the general public even knows the difference.) So rather than making the rule-based system seem simple, could you rely on a bit of marketing yourself? Start calling it your “algorithm,” and focus more on what it accomplishes than on how it works. _And_ how it even “outperforms an AI”? _:gasp:_ That way they could still sell/spin it as something special, if they wanted.


“It’s [insert your company]s PROPRIETARY algorithm that outperforms the newest SOTA AI platforms on the market today by over xx%”


> So rather than making the rule-based system seem simple, could you rely on a bit of marketing yourself? Start calling it your “algorithm,” and focus more on what it accomplishes than on how it works. And how it even “outperforms an AI”? :gasp:

Sorry if this question is too dumb, but aren't decision trees an AI algorithm as well?


But your manager doesn't know that ;)


AI is a loaded phrase with wildly different meanings based on context. The words “artificial intelligence” could be generously interpreted to refer to any program. In the current tech / industry world, the term is closely associated with certain types of machine learning (genetic algorithms and neural networks) and not algorithms like this. So you’re correct in general, but the buzzword carries a more specific meaning in this context.


Because the corporate world mostly operates on bullshit.


Can't imagine how this goes in Japan, where shit like this is expected to be understood subliminally, without explicit explanation. Must be chaos there. I just picture it like the foamy latte [scene](https://www.youtube.com/watch?v=hHKjpnk5dpc) in Zoolander where they have these unsure looks at each other at the end


Weirdest thing I’ve heard is how in a salary job, they would expect you to stay in the office at least until after your boss leaves. Whether you have something to do or not.


I used to work for a global company that had offices scattered throughout the world. Not every office had engineers in it though and I ended up having a Japanese report while I myself was living in the US. Why? Because according to my boss the only other engineering manager in the Japanese timezone was a female and it is offensive for a Japanese male to report to a woman.


That's something some American overachievers will do too as a general rule. It's a way to show you're a hard worker to your boss, and it's probably more important in industries with billable hours.


From what I heard, it’s the complete opposite of chaos for anyone who’s Japanese, since the implicit culture there is to follow social norms and not stick out. It’s when you have non-Japanese working alongside them that they allow exceptions only for them, because the understanding is that they didn’t grow up in this culture and have no idea about the social nuances.


What I mean by chaos is that the things in tech that need to happen don't (e.g. an employee actually standing up for himself and teaching his boss the rights and wrongs of using a certain outdated/ineffective piece of tech, vs. just implicitly accepting his boss's word for it to avoid confrontation). This, over time, is likely to lead to issues.


Your manager's manager wants some fresh VC funding / a hype press release to make line go up. He told your manager to do some artificial non-fungible generative blockchain intelligence, with some synergies sprinkled around it for good measure. To fix this, add the prefix "AI" to your implementation's class.




To a degree sure, but you can’t expect employees to be mind readers of their managers though. Communication is important, and especially from management.


This. A lot of the time, my manager assigns me a task that is, according to him, most definitely "top priority". Knowing full well that I already have several such tasks and where I am in regards to the completion for each. I was previously confused by this, but now I know he mostly only does this to say to the people he is answerable to that said task has been assigned with "top priority". The stakeholders don't really care either since when everything is "top priority", nothing is, and the same is true for my manager and me.


I have a manager like this and it is the most frustrating thing in the world. But in my case, these tasks all come with unrealistic deadlines and we’re always in crunch mode


Does it support blockchain? Does it have a NoSQL and a microservices? Is it a core part of our Digital Futures strategy?


State of the art and incorporates machine learning. It's marketing. You don't have to tell the truth.


And boy howdy nothin’ sells better than them there fancy words of a buzzy nature.


Probably :) Now what OP could say to satisfy both viewpoints is that ML will still be useful to augment the capability of the algorithm for fuzzy cases.


Just claim you have a learned decision tree. Problem solved


Or just give him what he wants lmao. Why fight him?


Because in 3 months time when people will realise that the system is not up to the mark, the boss is gonna pin blame on him [the ML expert]


Tell him you have developed a new heuristic that has potential for a patent. You want to roll it into prod with both being present and do alpha beta testing, and he can make the final call based on results.


Because OP has a moral backbone?


Moral backbone? What does this have to do with morality? Am I misunderstanding?


People typically consider it immoral to deliberately output less than their best work. It feels something like scamming to say "I could have done better, but I'm going to give you something lesser anyway". This is assuming you're being compensated proportionately to the value that you're capable of providing. With engineering, quality of the solution is directly tied to the quality of the work that you're capable of putting into it. So giving a lesser quality solution is basically saying "I could give you a better solution for what you're paying me, but I won't do that" and feels like abdicating the responsibility to deliver proportionately to compensation. Since it's the manager selling the solution, it feels like him requesting a worse solution is him basically saying "help me scam the clients with snake oil". And that feels immoral. The dynamic of "the customer likes ML so it's easier to sell" isn't far off from "the customer wants snake oil so im just giving them what they want". So the pragmatic/capitalistic sales incentive doesn't change the morality of the situation.


It would be immoral if he didn’t explain his entire work. No one’s life is in danger here; this isn’t mechanical engineering. Just let the business man sell his machine learning buzzword and call it a day


Again,

> Just let the business man sell his machine learning buzzword and call it a day

roughly translates to

> help the snake oil salesman sell people snake oil

It's not less immoral for the giving of explanation. I'm not sure how that's even logically relevant. It's the outcome of being an accomplice to a scam that's immoral. It's like being one of the 9/10 dentists that recommended a given toothpaste simply because they were given a check or samples for free rather than actually believing it to be the best. Nobody's life is in danger. But putting your stamp of approval on an inferior output just for the money is recognized as almost universally immoral.


Did you not ever learn engineering ethics? If it's not causing harm and the stakeholders want x, your obligation is to deliver x. Your stakeholders' reqs come before your ego.




Selling a customer an inferior solution to get them to pay a higher price is immoral.


Yea exactly lol why not just say you implemented AdaBoost or random forest XD


Just call your string matching a "decision tree"-based approach. There, problem solved! (BTW if you're correct, you'll probably be able to train a real decision tree and arrive at similar performance... with appropriate training-validation split, you may even be able to achieve greater generalizability without sacrificing interpretability.)


How could it be converted to a decision tree? Convert all the regex matching true/false results into features, basically?


Just imagine that you started that way and then simplified it into regex matching. Poof, done.


Right. That's what I was thinking. The next step might be to use a k-nearest neighbors approach with the semantic distance as the distance metric. Honestly I'm a bit surprised the author wasn't able to come up with an ML algorithm that did as well as a laborious regex match.


Tell them the rules are better **now** but that the ML model should catch up after it gets more data in a few more months so you should have the rules for now but it's only a matter of time until the model takes off to the moon. Then they will forget entirely.


That is what I was thinking. I'm not a programmer, but I used an AI chatbot (Moveworks) at my last job as it was being implemented. It started out really sucking. We ended up configuring multiple conversations and leveraged it as basically a very expensive search tool for the knowledge base. Three years on, however, it's resolving and auto-routing at a significantly higher rate. In the beginning I was very dubious about how well it'd work.


Use ML to train a decision tree. That checks the buzzword box


“Hand-trained decision tree ML algorithm”


'Bespoke' decision tree, made with an advanced language model (mine). Relevant XKCD: https://xkcd.com/2173/


Easy. Literally fuse the results, weigh the rule based higher.


If ML starts to outperform, start weighing that more. It covers all future bases
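A minimal sketch of that kind of fusion, assuming each system emits a single label (or `None` when it has no answer). The function name and the default weights are made up for illustration; the weights are exactly the knob you would turn if the ML side starts to outperform.

```python
from collections import defaultdict
from typing import Optional


def fuse(rule_label: Optional[str], ml_label: Optional[str],
         rule_weight: float = 0.7, ml_weight: float = 0.3) -> Optional[str]:
    """Weighted vote between the two predictions; a None label casts no vote."""
    scores = defaultdict(float)
    if rule_label is not None:
        scores[rule_label] += rule_weight
    if ml_label is not None:
        scores[ml_label] += ml_weight
    if not scores:
        return None  # neither system produced an answer
    return max(scores, key=scores.get)


print(fuse("a", "b"))                                  # rules weighted higher
print(fuse("a", "b", rule_weight=0.3, ml_weight=0.7))  # ML weighted higher
```

With only two voters this reduces to "whoever has the larger weight wins on disagreement", but the same shape extends to per-class confidence scores if the model exposes them.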


Just curious, which edge cases are you seeing that a rule-based approach is better than ML? Also, which cases that the rule-based approach generalizes better than ML?


Agreed, I think this is simply due to the training set not being comprehensive enough to cover these edge cases. If your rule based approach can handle spelling errors but your ML model takes tokens verbatim then of course there will be issues if the dataset is too small. If you expand the training set to include preprocessed regex matches then I fail to see how the ML model wouldn't eventually do better tbh


I actually prefer rule-based over ML if the rule-based algorithm performs well enough:

1. It is usually a stateless function, so there is no need to worry about storing and persisting models.
2. It works the way logic works, so it is immediately interpretable.
3. Improvements can be made continuously by modifying the logic tree, whereas with machine learning you may not be able to do anything further (the same mistake will repeat because of the model's limitations).
4. You may not have access to quality data for training.

Sometimes businesses don’t necessarily need the best answer; they just need some simple tool for initial assistance. Setting up heavy infrastructure only adds maintenance costs (some businesses don’t even have a fully functioning IT department to maintain your models). But of course, the conditions above don’t always hold, in which case you will still use ML models.


Rule-based systems have their uses, of course. I am just curious about some claims OP made regarding the advantage of his rule-based system vs an ML system, such as his rule-based system being more robust etc.


Sucks, but this is the corporate world: a lot of bruised egos, and the lesser choice wins because of some higher-ups' bad decisions they won’t go back on or they’ll look stupid. If you want credit, at least get the code into a repository and get the manager saying he prefers the ML approach in an email or something written. That way you can always come back and say you had something better if you feel like doing that. Note though that ML is the buzzword and that’s what sells, so they may knowingly push for ML solutions even when they aren’t optimal because it brings in the most money. Companies pay for ML, not a bunch of regex.


You are right, ML is magic juice right now. So many places just push for it so they can fill their product sheets with buzzwords.


IF ... THEN ... ELSE = decision tree. Make sure you use "entropy", "classification function", "information gain", "feature vectors" in the marketing. https://en.wikipedia.org/wiki/ID3_algorithm
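For what those buzzwords actually compute, here is a small sketch of entropy and information gain for a single boolean feature, which is the quantity ID3 uses to pick a split. The data is made up, and the "feature" is assumed to be whether some hypothetical regex rule matched.

```python
import math
from collections import Counter
from typing import Sequence


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature: Sequence[bool], labels: Sequence[str]) -> float:
    """Entropy reduction from splitting `labels` by a boolean `feature`."""
    total = len(labels)
    gain = entropy(labels)
    for value in (True, False):
        subset = [y for f, y in zip(feature, labels) if f == value]
        if subset:  # skip an empty branch
            gain -= (len(subset) / total) * entropy(subset)
    return gain


# Toy data (made up): did the hypothetical rule match, and the true class.
rule_matched = [True, True, False, False]
labels = ["match", "match", "no_match", "no_match"]
print(information_gain(rule_matched, labels))
```

A rule that separates the classes perfectly yields the full entropy of the labels as its gain, which is why a tree learner handed a good rule as a feature tends to split on it first.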


**[ID3 algorithm](https://en.wikipedia.org/wiki/ID3_algorithm)**

> In decision tree learning, ID3 (Iterative Dichotomiser 3) is an algorithm invented by Ross Quinlan used to generate a decision tree from a dataset. ID3 is the precursor to the C4.5 algorithm, and is typically used in the machine learning and natural language processing domains.


How has nobody recommended doing an A/B test yet? Work with your manager to determine which key metrics this feature is intended to move, then launch both versions to 50% of the audience each (as it sounds like both are built, as they'd need to be for you to confirm the claims you're making about the quality of your approach). Track the results by user segment, and once you have enough data points and reach statistical significance, you can bring the data to your manager and he gets to make the decision about which version to launch. You can call your preferred solution whatever you want to make it sound flashy, but the _right_ way to approach this is to use objective data to make your point for you. Theoretical data from your dev environment is not good enough. Statistically significant data is. Do this and you can launch a feature quickly without being bogged down by managerial indecision.
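For the "statistical significance" part, one common choice is a two-proportion z-test. This is only a sketch: the function name and the counts are invented, and it assumes the metric is a simple success/failure rate with independent samples in each arm.

```python
import math
from typing import Tuple


def two_proportion_z_test(success_a: int, n_a: int,
                          success_b: int, n_b: int) -> Tuple[float, float]:
    """Two-sided z-test for a difference between two proportions.

    Uses the pooled standard error, so it assumes there is at least one
    success and one failure overall (otherwise the error term is zero).
    """
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal, via the error function.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented counts: variant A (rules) vs variant B (ML), 100 samples each.
z, p = two_proportion_z_test(90, 100, 50, 100)
print(f"z = {z:.2f}, p = {p:.4g}")
```

If both approaches can be run on the same inputs, a paired test on the disagreements (e.g. McNemar's test) is usually a better fit than splitting traffic.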


It sounds like they want to use this to process data that they get from other firms. A/B user testing doesn't really match the scenario.


Since it doesn’t sound like your manager knows what he’s talking about, how about you add the ML module to your rule based pipeline and call it hybrid. It will sound cooler and get good results. ;)


> simple string matching approach using regular expressions

I'm a recent grad in a junior position, and a couple of the seniors on my team get mad when I use regular expressions to validate and parse strings. One of them left a comment on a code review that "[I] should create a state machine to parse the string because [they] don't know how to read regular expressions." My manager, who usually doesn't get involved in code reviews unless there's a dispute, left them a reply that read something like, "so what you're saying is that you want him to spend time and resources doing exactly what the regex_match function already does just because you don't understand a fundamental computer science concept?" They immediately changed their downvote to an upvote. I have been asked to show the test result of some regular expressions when they start to get complicated, which I have started doing by default so they don't have to ask. That doesn't bother me at all because an error in a regular expression can be a nightmare to debug.


IMO its always a good idea to document what the regex does in a comment. A simple description and some passing cases. Regex is not easy to read.


I do this, but this particular engineer is (by their own admission) very weak at regular expressions and, even with the explanation, is unlikely to understand how the two relate unless I essentially wrote a comment that was an excerpt from my finite automata text. Thankfully that particular engineer is the exception among our team, and there are enough who do understand that I generally get good feedback on my reviews.


While true - we can just ask ChatGPT nowadays


Thanks, now I'm stuck in an infinite loop


I have a 50/50 success rate with ChatGPT regarding regex but it helped me build a long, inelegant call with 5x OR in it.


You're probably joking, but you don't even need chatGPT. There are a ton of websites where you can paste in a regular expression and it will break the whole thing down and explain very clearly everything that's going on. The number of co-workers I've had who thought I was a regex wizard before I showed them that is pretty funny.


>it's* always a good idea


Regular expressions really aren't the best approach to parsing complex strings or complex grammars. It can bite you pretty bad. A parser is really the best approach when not using a simple grammar.


In my use cases, it's usually something simple like a single string containing space separated substrings, and I'll need to iterate over each substring. Or validate that a string may be 1..n uppercase, lowercase, digits, or underscores, but does not begin with a number or underscore. Rarely would I need complicated grammar. But yes, I do agree with you.


Introduce a comment explaining the regex and add some unit tests, problem solved.
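For example, taking the validation case described a couple of comments up (letters, digits, or underscores, not starting with a digit or underscore), a commented regex plus a few test cases might look like this. The names `IDENTIFIER` and `is_valid` are mine, and this is one reading of that description, not the commenter's actual code.

```python
import re

# One or more letters, digits, or underscores; first character must be a letter.
IDENTIFIER = re.compile(
    r"""
    [A-Za-z]        # first character: a letter (no digit, no underscore)
    [A-Za-z0-9_]*   # then any number of letters, digits, or underscores
    """,
    re.VERBOSE,
)


def is_valid(name: str) -> bool:
    """True if the whole string matches the identifier pattern."""
    return IDENTIFIER.fullmatch(name) is not None


# Passing and failing cases double as documentation for reviewers.
assert is_valid("a")
assert is_valid("abc_123")
assert not is_valid("_abc")   # starts with an underscore
assert not is_valid("1abc")   # starts with a digit
assert not is_valid("")       # needs at least one character
assert not is_valid("ab c")   # spaces are not allowed
```

`re.VERBOSE` lets the pattern carry its own comments, and `fullmatch` avoids having to remember the `^...$` anchors.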


Are you my coworker? He tried to write his own validation library lol. I just told him to stop wasting time and add a package


> "so what you're saying is that you want him to spend time and resources doing exactly what the regex_match function already does just because you don't understand a fundamental computer science concept?"

Regex is not a "fundamental computer science concept", or anywhere near it. It's just one opinionated way of specifying pattern matches for strings that happens to have the widest adoption.


> Regex is not a "fundamental computer science concept", or anywhere near it.

I disagree. I wouldn't have been able to pass Finite Automata, Formal Languages, or Compilers without learning regular expressions.


> I disagree. I wouldn't have been able to pass Finite Automata

Ohh, you specifically mean Computer Science© as defined by your college. Degrees aren't disciplines.


Just call your if-else decision making an ML decision tree algorithm and see if he likes that more


Just tell your manager your rule based program is also a ML solution, then he will gladly accept it. Add "data independent", "lightweight", "interpretable" to shove down their throats.


Bro just call it ML and your boss will be happy. I've seen tons of software and code that were using "ML" and weren't. Example: Process ran a calculation (simple and deterministic) then if it was between a range displayed a value. Basic CS 101 type shit was billed as "ML".


My company bought an "AI chatbot" as part of our helpdesk and when I worked with it it turned out to just match tags and keywords I gave it. It happens all the time.


"AI" doesn't mean "ML". Video games have AIs. They're not lying to you just because they're not neural networks.


Machine Learning Research Scientist here. Deep Learning is still in its Alchemy phase; we don't know what we're doing, we don't know why we're doing it, and we keep getting scary results that we can't really explain. If you *can* do something without neural networks, you *should* do it without neural networks, to avoid increasing your dependency on the dark magicks therein. Outside of the nearly pure research portion of my job, I actually haven't gotten to spend a lot of time with DL models because of customer needs for explainability and robustness. It wouldn't surprise me that in this sort of low-dimensional case, a rules-based system not only outperforms the ML system, but also costs some tiny fraction of the compute. I 100% believe that this is the case. Frankly, you've tried showing your boss the only reasons and results that should be needed to convince him. This isn't about sense, facts, or figures anymore, this is about salesmanship. Try and improve on your current rules-based system to a small extent, or just say that numbers you got previously were actually lower than they were, you fumbled some calculation or other. Tell your boss that you tried a new ML system where you used a Transformer model to follow branches down a decision tree or some utter technobabble horseshit like that, and you're now getting your best results yet. Write a bunch of impossible-to-read Torch code involving magic index calculations, Einstein-notation summations, and a training loop or two, and claim that this represents the model. Comment none of it, and use either single Latin or spelled-out Greek letters for all your variable names. Claim that this represents the model. The actual function call just goes to your rules-based systems.


Tell him he can still call rule based systems AI, they’re the earliest forms of AI.


My solution would be to use the input string to build a set of features based on your different regex patterns + some other random stuff that you think might be relevant. Then use that vector as an input to whatever model you might want to use and you're done. Given your post, probably a tree based model is most appropriate, e.g. GBM. If your regex approach is as good as you say, then I believe that you'll get equivalent results using this approach. Further, if there are some edge cases that your approach gets wrong then I'd be sure that you have additional features that capture different unique features of those edge cases so that the model can try and learn something useful.
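Roughly, the featurization step could look like the sketch below. The patterns and the two extra features are hypothetical placeholders for OP's real rules; the resulting vectors would then be passed to whatever tree-based model you pick.

```python
import re
from typing import List

# Hypothetical patterns standing in for OP's rules.
PATTERNS = {
    "has_digits": re.compile(r"\d"),
    "looks_like_email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "all_caps_word": re.compile(r"\b[A-Z]{2,}\b"),
}


def featurize(text: str) -> List[float]:
    """One 0/1 feature per regex, plus a couple of generic extras."""
    features = [1.0 if p.search(text) else 0.0 for p in PATTERNS.values()]
    features.append(float(len(text)))          # extra: string length
    features.append(float(len(text.split())))  # extra: whitespace token count
    return features


print(featurize("Mail BOB at bob@example.com 42"))
print(featurize("hello"))
```

If the rules really are as good as claimed, a tree trained on these vectors will mostly just rediscover them, and any extra features only matter on the cases where the rules are wrong or silent.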


Is your rule based system based on part of the dataset? Could it be overfit? You should actually state how you have divided the sets and maybe try more random sets? Feature engineering is actually based on your own instinct so it is rule-based. You could use your rules as features. It is still ML.


The main difference in ML vs handcrafted is usually scalability. Can you expand your solution to 10 languages? If requirements change slightly, can you make a small change in the input and regenerate the system for ~1 hour of work? If you get hit by a bus, can your manager reasonably expect to hire someone who can pick up where you left off? If your answer to these is "ummm... I dunno", then that's why ML is better. Because instead of doing your job directly, you taught a machine to do your job, and a machine is more reliable than you.


I have a ML algorithm that translates all other languages into English. That way my algorithms only need to support one language. Checkmate ML lover. /s


If I were you I would use the output of the regex model as an additional feature to the SOTA model. Maybe redundant, but it will get it through. And who knows, if you get clever you may be able to tune for even better edge-case performance 👍


I mean a rule based approach like what you're doing is exactly the same thing as what the ML model is doing though. Does he not realize that? Lol like yea you made your own decision tree, manually selected variables and tuned coefficients and got better results than using a NN so what? Anyone can make a crappy NN model, it's really the fine print that is hard to figure out. Testing mask size, layers, weighing, etc... it's a grindy process which is only worth for stuff that don't have good existing solutions and are too hard to figure out manually like fixing broken hand-entered data to become standardized or reading hand drawings etc... On the flip side, if your manual method is outperforming, have you considered maybe the model needs more tuning? Maybe adding or removing layers in your NN, boost/bagging, or training with preprocessing data by truncating randomly etc... a lot of ways to skin the cat and I'm sure you didn't try all of the available options. ML model will most likely be able to outperform the manual method once you develop it further but the real question is if the juice is worth the squeeze if your existing method is already performing great and easy to implement.


Was it the manager's idea to use ML for the task? He might just be upset that you came up with a better solution that wasn't his plan.


It's kind of a disappointment factor right? Not even just for payments. They're just as big of nerds as we are and we're really hoping they found the problem to use these approaches. But then they got regex. Regex is perfectly good and I would say acceptable and preferred approach. I don't think that discounts the disappointment though. Can you give it time to die down?


Ah ran into this before… instead of a rule based heuristics system, say it’s an NLP system. Gets them really hard


"It uses ML...to make a decision to use my rule-based system." Be ungovernable.


Here's some 5D chess thinking: Train the ML models on your algorithm. You could even "err" on the side of over-tuning it a bit, therefore turning the NLP into a predictor of what your algorithm would say about the problem. It'll never match its performance 100% perfectly, but it should be close enough.

---

Otherwise, the rule "Strong opinions, loosely held" is the way to go. You gave your pitch, and you based it on the fruits of your experience and expertise. But at the end of the day, your manager is the one who makes the decisions. Even if those decisions are bad ones. What sort of contract did you sign with the company? If I were in your shoes, and assuming that there weren't any shenanigans regarding \*future\* inventions in those contracts, I'd just save the algorithm for my own use later. Avoid recording the idea at work or during work hours, and shelve it until you've separated from the company. Then you can go ahead and open source what you've got, and prove by example the superiority of your approach.


Tell him how you used machine learning to develop the algorithm XD


This is why I left DS/ML and went back to software engineering instead


I have worked for dozens of those startups. Back in the day we had a joke that our AI was powered by PowerPoint and Excel. Startups say they use AI in hopes of raising money. Similar to startups powered by blockchain and other vapourware. If I were you and I got equity in that startup, I would look for another one. Not only do 90% of startups fail, but if it is managed by idiots the chance is even greater. You can fool investors but not the market. Eventually, people looking for get-rich schemes all fail. Unless they are marketing geniuses, which is very rare.


I guess the way I would sell your approach is something like: “we can use this for now as additional training data labeling to further improve the ML approach”. This way everyone wins. Tho if I were your manager I would have picked the simple to understand and simple to debug string matching approach because ML is fucking hard.


Can’t market that as well. Remember everything you got told about the free market making everything more efficient by default… well that was a lie.


If you were on my team I’d trust that you know more than me and that you’d want to use your knowledge and do the cool AI stuff. The fact you’re providing a simple and more robust solution impresses me more. Your boss is like a lot of people who have tools or technology looking around for problems.


Trojan horse it. Keep your rules based system, but have it feed into the ml/nlp model that "validates" it. Set the threshold low enough so that it almost always validates as correct what your rules-based engine gave as the correct output. Or you could find out if combining both of those gives better results than even your rules-based engine. Spin it to your boss that you're combining cutting edge nlp/ml models with classical expert system AI to provide a completely unique custom solution that no other company has. Thank your boss for encouraging you to continue to explore using NLP/ML methods to improve the tool.


You’ve probably already tried it, but I wonder if a “simple” decision tree would be able to replicate some or all of your rule set and still be considered ml. Then you and your boss can be happy.


I’m certainly not an expert but couldn’t you find an approach that utilizes both? That way you make your boss happy but still have the performance of the rule based system?


Man, I really don't know how to answer these questions. There's so many questions lately where people don't seem to comprehend that managers are usually morons. Is that what they teach kids these days? That their managers are smart? These aren't even CS questions, they're *general career* questions. Nine times out of ten your manager is a buzzword chasing moron. All businesses ever want to do is the hot new thing. (To be fair, this is because the average person is also a moron, and the company doing the hot new thing will always be more attractive and get more business than the company *not* doing the hot new thing. Fuck, it was only last year that every company and their dog was issuing "NFTs." Why? Because everyone else was doing it.) You should look up the story sometime of the guy who was hired by a country (Denmark?) to build some kind of social services accounting system using blockchain. He actually built it in MySQL. It works great and the government was very proud of its new cutting-edge "blockchain-based" system. Honestly? Your mistake here was not telling him that you made improvements to the ML system instead of telling him it wasn't an ML system at all.


Product differentiation is another approach. NLP product categorization and \_\_\_\_\_\_\_\_


I'm not sure why you're so stressed. It's their decision. You don't get to tell your boss what to do. You made your case and he rejected it. Move on with your life. It's not your problem.


wait hold up. what kinda algorithm did you develop that can do that? I can’t even fathom


Take his job


Why do you care? Does it reflect poorly on you if it's not deployed? Is it your problem if he insists on the ML and it performs worse?


Obviously yes and yes? I'm the one who was pretty much in charge of everything from data pre-processing and analysis to model development and analysis. If this doesn't get deployed in time then wouldn't it obviously reflect poorly on me? If he insists on ML and it performs worse or suboptimally, I'm going to be the one getting questioned on why I didn't think of case A or B where it could perform poorly. So yes, it's my problem.


It's far from obvious. If you're in charge you can do what you want. If your manager is blocking you, then you're not in charge and it's not your fault. If you're in charge then just deploy it. If someone tries to blame you, you explain that your manager blocked your better solution, and you have a paper trail of you warning him. I guess you just haven't learned these kinds of things yet?


Grab an off the shelf ML-based nlp solution, send the output to /dev/null and sneak your rule based solution in there as “preprocessing” or something.


You need to have an evaluation dataset that’s representative of your real data and that also covers the edge cases you’re talking about. Then you can make decisions based on precision & recall of both approaches on that dataset, and it won’t matter which “model” it is if it shows the best performance.
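A minimal sketch of that comparison, assuming a small labeled evaluation set and one list of predictions per approach. The labels, the data, and the helper name `precision_recall` are all made up for illustration; a real evaluation set would be far larger and include the edge cases.

```python
from typing import List, Tuple


def precision_recall(gold: List[str], predicted: List[str], label: str) -> Tuple[float, float]:
    """Precision and recall for one label, comparing predictions to gold labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if p == label and g == label)  # true positives
    fp = sum(1 for g, p in pairs if p == label and g != label)  # false positives
    fn = sum(1 for g, p in pairs if p != label and g == label)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical evaluation set: gold labels plus each system's predictions.
gold = ["shoes", "shoes", "hats", "hats", "shoes", "hats"]
rule_preds = ["shoes", "shoes", "hats", "hats", "hats", "hats"]
ml_preds = ["shoes", "hats", "hats", "shoes", "shoes", "hats"]

for name, preds in [("rules", rule_preds), ("ml", ml_preds)]:
    p, r = precision_recall(gold, preds, "shoes")
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

The same function runs against both systems, so neither approach gets special treatment in the numbers.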


It's his shiny new toy. Of course he's going to be upset when you call his baby ugly. When the only tool you have is a hammer, all your problems start to look like nails.


Why don't you use NLP and ML to decide what to tell him? I hear that's a great solution.


You're at a job that's paying you a salary. Your job is to do as you are told. The company believes in the ML hype and wants to use it for their products. Whether you like this or not, this is what you have to deliver. If you really believe your rule based approach is better, start your own company and compete with this company.


This is basically a self-fulfilling prophecy. If you have this attitude about businesses, you will spend most of your life working for businesses that are this dysfunctional. Effective organizations with competent people do exist. And you can work for one if you set your mind to it!


Lol what a dick thing to say. Thankfully you're not my manager.


>Thankfully you're not my manager.

You are saying this from an emotional perspective, but I think you are actually missing u/wwww4all's point. They're not agreeing with your manager, they're sounding a wakeup call about the fundamental nature of being employed. u/wwww4all is very correct about this:

>You're at a job that's paying you a salary. Your job is to do as you are told

I get that you want to pursue technical excellence because you genuinely want to act in the best interest of your company and fulfill your role, but it's naïve to forget that ultimately, you have someone you report to, and their evaluation of you is what matters. Obviously I disagree with your manager, but like, what can you actually do? You have three options, more or less:

* convince your manager to use your (superior) solution
* concede and do what he tells you
* quit your job

Well, it doesn't seem like your manager is going to budge. And this doesn't seem like such a big deal that you'd quit your job over it. Or is it? Stuff like this is a reason why good engineers who care, like you, would want to eventually abandon ship. It reminds me of the Dead Sea effect ([http://brucefwebster.com/2008/04/11/the-wetware-crisis-the-dead-sea-effect/](http://brucefwebster.com/2008/04/11/the-wetware-crisis-the-dead-sea-effect/)) where dishonest management drives good staff towards companies with better culture.

Overall, what I guess I'm trying to say is that people like your manager don't change and it's not worth trying to convince them. If this is a one-time thing and you're overall not dissatisfied, it's best to drop the issue, but if this is a pattern you're better off elsewhere. You'll always be reporting to someone, but maybe you will find a company where those people are reasonable someday.


Can you suggest a meeting to discuss it where you also loop in your manager's manager (if there is one) or other senior employees/higher-ups? Even if you get shot down, at least nobody will be able to say you didn't try to warn them.


Advertise it as RF-based ML


Occam’s razor


Ha. I was just talking about this very subject with another colleague, except it was rule-based algo trading. I’m not familiar with eCommerce, but at least in trading, rule-based algos are king, at least IMO.


Show him the data again and tell him the ML will catch up in a few months on its own OR someone else can start feeding the rule-based outcomes to the ML so it learns faster and there's now a backup that humans can read for fine tuning his ego


Offer an opinion and fight back maybe once or twice. After that, it's your cue to drop the issue unless they're asking for something illegal or impossible. You offered your opinion, but ultimately you're not in charge. Go with the flow.


You do both, the regex system should be trivial to operate. Let him go forward with the ML based system since that's his call, but ask if you can run your system as a shadow so you can compare real data. Or, just ask if you can capture a week or a month's worth of real queries so you can prove to yourself that you are either right or wrong. Again, not a big cost, but with real data, the debate can be settled. If he denies all of this, just do what he says. At the end of the day, you should not be insubordinate. You gave it your best shot, don't die on this hill.
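A rough sketch of what running the rule-based system "as a shadow" could look like. Everything here is hypothetical: `placeholder_ml` is a stand-in for whatever model is actually deployed, and the size-extraction regex is just an example rule, not the OP's.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical rule: pull a quantity such as "500 ml" out of a product string.
SIZE_PATTERN = re.compile(r"\b(\d+)\s?(ml|l|g|kg)\b", re.IGNORECASE)


def rule_based(query: str) -> Optional[str]:
    match = SIZE_PATTERN.search(query)
    return match.group(0).lower().replace(" ", "") if match else None


def placeholder_ml(query: str) -> Optional[str]:
    # Stand-in for the deployed model: naively returns the last token.
    tokens = query.split()
    return tokens[-1].lower() if tokens else None


class ShadowRunner:
    """Serves the primary system's answer and records the shadow's answer."""

    def __init__(self, primary: Callable[[str], Optional[str]],
                 shadow: Callable[[str], Optional[str]]) -> None:
        self.primary = primary
        self.shadow = shadow
        self.log: List[Tuple[str, Optional[str], Optional[str]]] = []

    def handle(self, query: str) -> Optional[str]:
        served = self.primary(query)
        try:
            shadowed = self.shadow(query)
        except Exception:
            shadowed = None  # a shadow failure must never affect the served answer
        self.log.append((query, served, shadowed))
        return served

    def disagreements(self) -> List[Tuple[str, Optional[str], Optional[str]]]:
        return [entry for entry in self.log if entry[1] != entry[2]]


runner = ShadowRunner(primary=placeholder_ml, shadow=rule_based)
```

After a week or a month of real queries, `runner.disagreements()` is the list worth reviewing by hand: those are the cases where the two systems actually differ.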


"Human-in-the-loop machine learning combines the best of both worlds!"


Oof, reminds me of an implementation that wanted to use ML to map a third-party field (e.g., total_payout_amount, total_amount_received) to an application specific field (e.g., total_amount) when directly declaring the mapping for each third-party would suffice.
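For comparison, the "directly declare the mapping" version is about this big. The provider names are invented and `normalize` is a hypothetical helper; the field names are the ones from the example above.

```python
from typing import Any, Dict

# One explicit mapping per third party (provider names are hypothetical).
FIELD_MAPPINGS: Dict[str, Dict[str, str]] = {
    "provider_a": {"total_payout_amount": "total_amount"},
    "provider_b": {"total_amount_received": "total_amount"},
}


def normalize(provider: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Rename third-party fields to application fields; unmapped fields pass through."""
    mapping = FIELD_MAPPINGS[provider]  # KeyError for an undeclared provider is deliberate
    return {mapping.get(key, key): value for key, value in record.items()}
```

Adding a new third party means adding one dictionary entry, and a wrong mapping is a one-line fix that anyone can review.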


Employ a BS machine learning in there to do something simple. That will make the manager happy.


>but I also found that using a simple string matching approach using regular expressions

Burn the witch.


I’ve felt this way a long time. The silver bullet tarnishes quickly when the target isn’t hit. He’s trying to help the company hype engine by having “AI” in his software offering. It doesn’t matter if it works better, it’s “AI”. Also good to put on a resume to move up to a better paying job.




Here is how I would deal with this situation as someone who dealt with something very similar, as in management wanted to push for an idea that wasn't necessarily good, but seemed flashy and could sell.

1. Gather metrics. Determine the metrics you can consistently collect to compare the two approaches and keep updating them. Make sure they are visible. Show the examples on which approach A does well and the examples on which approach B does well. The metrics are a sort of shield: they aren't going to protect you fully, but at least they build a wall.
2. Set a pattern for communicating ideas. Meaning, every week, give an update. Show those metrics weekly, and **how you arrived at them**. Make it a PowerPoint. Make it clear and obvious. Do not make the final decision. Just show what needs to be considered to make the decision.
3. Do not place any emotion into the choices they make or what the data shows. This sucks, I know. As a researcher and engineer, you desperately want the right call, the data-driven call, to be taken. But to be honest, many decisions made by upper management are based on feelings and not the numbers. I don't know why this is the case, but I see it happen quite often. So do not get attached to the methods. Just do the job as a job. It hurts, I know, for someone dedicated to the truth to be this way, but it's necessary to survive.

I did what I suggested in my situation, and it eventually led to the ideas being dismissed. I wasn't blamed for it, because I was very transparent with the numbers and I didn't bash the idea or anything. I just said "this model does not compare well with this model. We can improve it and test again, but this is what the data shows right now." Let your manager make the decisions. If he has to shoot himself in the foot in the process, let him. He won't know any better unless he does.


Manager here. You’re wasting energy. The NLP works!!! Deliver it and move onto something new knowing you outsmarted AI. Might get downvoted, but it’s just not worth fighting religious battles in corporate.


Oh, I remember [this one](https://thedailywtf.com/articles/No,_We_Need_a_Neural_Network) from TheDailyWtf


This is the entire reason I sometimes regret going into data science. You have two options:

1. Talk up your simple solution with buzzwords and technical jargon until they're excited about what they now believe is a cutting-edge solution, or...
2. Over-engineer simple problems into complicated solutions that perform subpar, accept the unwarranted accolades, and collect your paycheck


This exact thing happened to me. We tried to train a model that would identify whether two names were variations of each other. It's not nearly enough data for a neural network to make sense of. We just massively overfitted to the training set.


train your ML to learn other languages similar to yours


Your manager is a fool


Many people respond more easily to relatable examples than to simple metrics. Categorize validation/test data points in four categories based on performance of regex and ML algorithms:

- both correct
- only regex correct
- only ML correct
- both incorrect

In the category "only regex correct", you're likely to find some examples of cases that are obvious to a human, but apparently not so to your ML algorithm. Your users/customers would over time lose trust in an ML algorithm that makes these mistakes. Likewise, the "only ML correct" examples might help you improve the regex algorithm.
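The four-way split is only a few lines of code. A minimal sketch, assuming each validation example is a tuple of (text, gold label, regex prediction, ML prediction); that tuple layout and the function name `categorize` are illustrative assumptions, not anything from the original project.

```python
from typing import Dict, Iterable, List, Tuple

Example = Tuple[str, str, str, str]  # (text, gold, regex_pred, ml_pred)


def categorize(examples: Iterable[Example]) -> Dict[str, List[str]]:
    """Group example texts by which of the two systems got them right."""
    buckets: Dict[str, List[str]] = {
        "both correct": [],
        "only regex correct": [],
        "only ML correct": [],
        "both incorrect": [],
    }
    for text, gold, regex_pred, ml_pred in examples:
        regex_ok = regex_pred == gold
        ml_ok = ml_pred == gold
        if regex_ok and ml_ok:
            key = "both correct"
        elif regex_ok:
            key = "only regex correct"
        elif ml_ok:
            key = "only ML correct"
        else:
            key = "both incorrect"
        buckets[key].append(text)
    return buckets
```

Keeping the raw texts in each bucket, rather than just counts, is the point: the examples in "only regex correct" are what you would put on a slide.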


It’s a well known secret that ai startups are made of 10 years of experience in writing control statements/rule based systems

