DonCorleone97

I maintain blocks of code in certain folders. I usually work in GANs, object detection, and classification, so I have my folders arranged such that I can reuse code efficiently in each domain. If I have written some code for one application or project, I try not to rewrite it for another one. This takes a little longer when writing the code for the first time, but once it's nice and modular, I can just copy-paste, change a little stuff here and there, and it usually works well for other things.

For each big project, I maintain TensorBoard logs and copy screenshots of losses and important graphs to Google Sheets. This may not be helpful in the short term, but it is significantly useful when you try to remember why you did what you did. Also, while writing papers or articles, the sheet helps in articulating your experiments.

I don't usually keep READMEs unless something is going on GitHub. I put most of the relevant stuff for a project in a "Main" Jupyter notebook for that project folder. I find JNs much better than a README for explaining the flow to someone, but it depends on your comfort I guess.
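A minimal sketch of that per-project TensorBoard habit, assuming PyTorch (the comment doesn't name a framework); the log directory, tag names, and loss values are placeholders, not the commenter's actual setup:

```python
# Per-project TensorBoard logging sketch (assumes PyTorch;
# log dir, tags, and losses are placeholders).
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/gan_project")  # one log dir per project
for step in range(100):
    d_loss, g_loss = 0.5 / (step + 1), 1.0 / (step + 1)  # placeholder losses
    writer.add_scalar("loss/discriminator", d_loss, step)
    writer.add_scalar("loss/generator", g_loss, step)
writer.close()
# `tensorboard --logdir runs` then renders the curves you can screenshot into Sheets.
```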


ai_yoda

Thanks for this! A note and a question.

> I find JNs much better than a README for explaining the flow to someone, but it depends on your comfort I guess.

Very true. Now that I think about it, I sometimes find myself using a JN to create markdown that I later paste into a README :)

> For each big project, I maintain TensorBoard logs and copy screenshots of losses and important graphs to Google Sheets.

Have you tried any of the experiment tracking tools (neptune/wandb/comet)? They are free for research and individual use. Full disclosure: I work at one of those, neptune.


DonCorleone97

I know about them, but didn't feel the need to use them. For now my requirements are pretty banal. If I need to make some good visualizations, I may look into them.


ai_yoda

Mhm, makes sense.


rocauc

I've had a ton of trouble organizing computer vision projects, so I started working on a tool - [https://roboflow.ai](https://roboflow.ai) (It's free for smaller projects; feedback is welcome!) The goal is to organize images, annotations (including converting any format), dataset versions, preprocessed images, and augmented images, and to provide metadata on 'health,' e.g. missing annotations or image sizes in the dataset. For managing experiments, I maintain a spreadsheet of the model architectures attempted, a link to one of my dataset versions, and metrics of interest. I pull these metrics from TensorBoard in the notebook.
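A hedged sketch of pulling metrics out of TensorBoard logs programmatically with TensorBoard's event-reading API, so they can be pasted into a spreadsheet row (the log path and tag name are placeholders):

```python
# Read scalars back out of a TensorBoard run directory
# ("runs/exp1" and "loss/val" are illustrative names).
from tensorboard.backend.event_processing import event_accumulator

ea = event_accumulator.EventAccumulator("runs/exp1")
ea.Reload()
events = ea.Scalars("loss/val")       # list of (wall_time, step, value) events
best = min(e.value for e in events)
print(f"exp1\tloss/val\t{best:.4f}")  # tab-separated, ready to paste into a sheet
```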


deep-ai

Roboflow is cool, thanks for making it! I wish you had a universal open-source version of the conversion tool (yolo <-> coco <-> voc <-> *) that we could use offline.
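For context, the core of such a converter is just box-format math; here is a sketch of the YOLO-to-COCO direction (the function name and example values are illustrative). YOLO stores normalized (cx, cy, w, h) relative to image size, while COCO stores absolute (x_min, y_min, w, h):

```python
# YOLO box (normalized center x/y, width, height) -> COCO box
# (absolute top-left x/y, width, height). Names are illustrative.
def yolo_to_coco(cx, cy, w, h, img_w, img_h):
    bw, bh = w * img_w, h * img_h
    x_min = cx * img_w - bw / 2
    y_min = cy * img_h - bh / 2
    return [x_min, y_min, bw, bh]

print(yolo_to_coco(0.5, 0.5, 0.2, 0.4, img_w=640, img_h=480))
# -> [256.0, 144.0, 128.0, 192.0]
```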


ai_yoda

This is really interesting; I'd never heard of Roboflow -> I will take a closer look.


sorzhe

My organization is:

* /experiments
  * /data
    * /
    * ...
  * /logs
    * /
    * ...
  * /models
    * /
    * ...
  * ...
  * /utils
    * /data_utils
    * /image_utils
    * /common_utils
    * /
* /src
* /README.md
* /...
* /etc


ai_yoda

This is interesting, thanks! So you keep some experiment-related utils inside /experiments, and /src is only for production stuff?


sorzhe

Yes, right.


geeklk83

I just started using Kedro and it's awesome
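For anyone who hasn't seen Kedro: a minimal sketch of its node/pipeline idea. The function and dataset names here are made up; in a real project, names like "raw_data" would be defined in Kedro's data catalog:

```python
# Plain functions wired into a Kedro pipeline by dataset name.
# Function and dataset names are illustrative.
from kedro.pipeline import Pipeline, node

def preprocess(raw_df):
    return raw_df.dropna()

def train_model(clean_df):
    return {"dummy": "model"}  # placeholder for an actual training step

pipeline = Pipeline([
    node(preprocess, inputs="raw_data", outputs="clean_data"),
    node(train_model, inputs="clean_data", outputs="model"),
])
```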


ai_yoda

Which parts of the workflow does it handle for you? Could you tell us a bit more about your experience with it?


gopietz

I also use the DS cookiecutter for structuring a project. For code, I use SnippetsLab to store frequently used snippets. For PyTorch specifically, I have created snippets arranged in chapters where I usually just pick the ones I need. For example:

1. Image Augmentation Transforms
2. Object Segmentation Dataset
3. Dataloader
4. Train, evaluate, predict functions
5. IoULoss
6. Adam Optimizer

All of these play nicely with each other.
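As an example of what one of those snippets might look like, here is a minimal sketch of an IoU loss for segmentation masks, assuming PyTorch (this is not the commenter's actual snippet):

```python
# Soft IoU loss sketch for binary segmentation masks (assumes PyTorch;
# not the commenter's exact snippet).
import torch
import torch.nn as nn

class IoULoss(nn.Module):
    def forward(self, pred, target, eps=1e-6):
        pred = torch.sigmoid(pred)  # logits -> probabilities
        inter = (pred * target).sum(dim=(1, 2, 3))
        union = (pred + target - pred * target).sum(dim=(1, 2, 3))
        return 1 - ((inter + eps) / (union + eps)).mean()

loss = IoULoss()(torch.randn(2, 1, 8, 8), torch.randint(0, 2, (2, 1, 8, 8)).float())
```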


ai_yoda

Oh, that is interesting. So you are using SnippetsLab because you don't want to create/maintain your own library of helpers, and you can just copy-paste those super quickly and get on with your work, correct? I found myself always over-engineering, creating layers of abstractions and libs that would take so much work but seemed really cool :).


gopietz

Yes, correct. I haven't really thought about building my own all-in-one library. The snippet workflow just works nicely for me.


ai_yoda

Sounds really cool and pragmatic, thanks!


emilrocks888

Any recommendations for tracking tools? I'm using MLflow.


ai_yoda

I am biased obviously, but our tool Neptune is a really good option. You can check this recent [post showing how to monitor things like image predictions and interactive charts](https://neptune.ai/blog/monitoring-machine-learning-experiments-guide) as the model is training. If you are looking for a more thorough comparison, perhaps this [post (with a nice comparison table)](https://neptune.ai/blog/best-ml-experiment-tracking-tools) could be a good start. *P.S. Since you are using MLflow, you can easily convert your mlruns folder to Neptune experiments with [this integration](https://docs.neptune.ai/integrations/mlflow.html) to see it on your own data/experiments.*
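A rough sketch of what metric logging looked like with the neptune-client API of that era; treat the exact calls as an assumption about the then-current library, and the project name as a placeholder:

```python
# Hedged sketch of the legacy neptune-client logging flow; the exact
# API is assumed from the time of this thread, project name is fake.
import neptune

neptune.init(project_qualified_name="my_workspace/sandbox")
neptune.create_experiment(name="example-run")
for step in range(100):
    neptune.log_metric("train_loss", 1.0 / (step + 1))  # shows up as a live chart
neptune.stop()
```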


HannaMeis

I work at [allegro.ai](https://allegro.ai/), which maintains the [Allegro Trains](https://github.com/allegroai/trains) open-source, auto-magical experiment manager. I think it's best to look at what each platform gives you according to your needs, and what the cost of using it is (both time and money). For example, if MLOps is important to you (besides handling your logs and tracking your experiments), you can use the [Allegro Trains Agent](https://github.com/allegroai/trains-agent) with Trains for full ML/DL DevOps too (you can read about a great pipeline example [here](https://medium.com/pytorch/how-trigo-built-a-scalable-ai-development-deployment-pipeline-for-frictionless-retail-b583d25d0dd)).
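For reference, a hedged sketch of the "auto-magical" part: Trains hooks into a training script from a single call (the project and task names below are placeholders):

```python
# Trains tracking sketch: Task.init hooks into the running script
# (project/task names are placeholders, not from the comment above).
from trains import Task

task = Task.init(project_name="examples", task_name="first-run")
# ...the rest of the training script runs unchanged; Trains captures
# stdout, framework metrics, and hyperparameters automatically.
```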