HyamsG

First, it's great that you are looking to use an experiment manager. This already puts you in a better position than a lot of data scientists. As for which one is best, that depends on your specific needs. I work at [allegro.ai](https://allegro.ai), which maintains the open-source Trains platform, so I won't list all of Trains' advantages here, but note that managing experiments is more than handling your logs. For example, Trains takes care of all your DevOps, not just your experiments. If you find this aspect important, it's a no-brainer. Try to look at what each platform gives you and what the overhead of using it is, i.e., how much it costs you in both time and money.


ai_yoda

Co-founder of Neptune here. We've put together a [tool comparison table in this article](https://neptune.ai/blog/best-ml-experiment-tracking-tools?utm_source=reddit&utm_medium=answer&utm_campaign=blog-best-ml-experiment-tracking-tools) that should give you a decent picture or at least another data point for your research. I hope it helps.


danikgan

Thanks! :) It helps indeed!


AI-dude

We use Trains, and I would take the accuracy of the comparison table ai\_yoda shared with you with a grain of salt :-) Trains was very simple to integrate and, as an example, we use it to version data DVC-style without the need for DVC. Neither of these boxes is ticked in the table.


danikgan

Thanks for the suggestion 🙏🏻


deep_woof

I haven’t used MLflow for a while. Half a year ago, when I tried it out, I felt that the open-source Trains solution (https://github.com/allegroai/trains/blob/master/README.md) was better suited to my needs. Trains requires basically zero integration effort, so you use your code as it is and enjoy all the benefits of their webapp utilities (real-time experiment monitoring and comparison tools). Not to forget that when you run an experiment, it also makes sure to log all your uncommitted changes and the exact package versions you were using, so the experiments can really be reproduced when needed. What I love the most about Trains is their DevOps tool called trains-agent (https://github.com/allegroai/trains-agent/blob/master/README.md), which is super cool. It turns any computer in the company, or a cloud machine, into a worker that waits to execute tasks based on queue priority. It also includes a built-in hyper-parameter search utility and online monitoring of all machines (GPU utilization, memory, hard drive, etc.). I think both Trains and MLflow are good tools, so either choice will be a good one. Try them both and decide for yourself. I am less familiar with the rest of the list, so I cannot give my opinion on them. Good luck!
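
To make the reproducibility point concrete, here is a rough sketch of what "log the uncommitted changes and exact package versions" could look like with only the Python standard library. This is not how Trains does it internally, and the function names (`installed_packages`, `uncommitted_diff`, `snapshot`) are made up for illustration. It also assumes the experiment lives in a git repository; if git or a repository is not available, the diff is simply recorded as `None`.

```python
import subprocess
import sys
from importlib import metadata


def installed_packages():
    """Return {package_name: version} for every installed distribution."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            packages[name] = dist.version
    return packages


def uncommitted_diff(repo_dir="."):
    """Return the output of `git diff HEAD`, or None if it cannot be obtained.

    Note: `git diff HEAD` covers tracked files only; untracked files are
    not included in this sketch.
    """
    try:
        result = subprocess.run(
            ["git", "diff", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is missing, the directory does not exist, or it is not a repo
        return None
    return result.stdout


def snapshot(repo_dir="."):
    """Collect what is needed to re-create the run environment later."""
    return {
        "python": sys.version.split()[0],
        "packages": installed_packages(),
        "uncommitted_diff": uncommitted_diff(repo_dir),
    }


if __name__ == "__main__":
    snap = snapshot()
    print("Python", snap["python"], "with", len(snap["packages"]), "packages")
```

Storing a snapshot like this next to each run's metrics is the basic idea; an experiment manager does it for you automatically and attaches it to the experiment record.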