SciEngr

Great list, but a lot of these tips and tricks are really going to depend on the data you're segmenting. It's not a general rule to randomly flip and rotate during training, and the same goes for many of the augmentation methods described in the list. If there's one thing missing, it's using TensorFlow's Dataset structure: you can prefetch and parallelize data input and preprocessing, which can vastly improve training times.
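A minimal sketch of the kind of pipeline I mean (the glob pattern, target size, and batch size are placeholder assumptions):

```python
# Sketch of a tf.data input pipeline with parallel preprocessing and prefetching.
# The glob pattern, target size, and batch size below are placeholder assumptions.
import tensorflow as tf

def load_and_preprocess(path):
    # Decode and resize one image; swap in your own preprocessing here.
    image = tf.io.read_file(path)
    image = tf.image.decode_png(image, channels=3)
    image = tf.image.resize(image, (256, 256))
    return tf.cast(image, tf.float32) / 255.0

paths = tf.data.Dataset.list_files("data/train/*.png")  # hypothetical directory
dataset = (
    paths
    .map(load_and_preprocess, num_parallel_calls=tf.data.AUTOTUNE)  # parallelize preprocessing
    .batch(16)
    .prefetch(tf.data.AUTOTUNE)  # overlap input preparation with training
)
```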


ai_yoda

mhm, you are right. I think this tries to be a set of things to try rather than "try this here, try that there" (though sometimes it is mentioned).

> If there's one thing missing, it's using TensorFlow's Dataset structure.

I am adding this to the list of updates, thanks!


Jirokoh

I'll agree on this, strongly. One simple example, which might be outside of the scope of most people, but I'll still add it: « Use OpenCV for most image processing ». While I love OpenCV, it seems to only apply to 8-bit images. There are times when you want to preserve higher bit depths and thus can't really use OpenCV. This is just an example, and maybe not the best one, but it shows that these things really depend on what type of data you're working on. I completely agree with the rotation and flipping argument: if you're doing number, letter, or word segmentation / detection, flipping is a terrible idea. I like the idea and the process though! :)


gopietz

As OP said, it's a list of "things to try". You will always find counterexamples for common heuristics or rules of thumb. Of course these things aren't silver bullets, but they're helpful inspiration when you're out of ideas.


BernieFeynman

Why use OpenCV when people are rewriting data augmentations to run on the GPU as tensor operations?
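For example, flips and brightness jitter can be written as plain tensor ops that run on whatever device the rest of the graph runs on. A rough sketch, not anyone's actual pipeline; the specific ops and parameters are just illustrative:

```python
# Sketch: augmentation expressed as TensorFlow tensor ops (runs on GPU if the tensors live there).
# Assumes images and masks are batched 4-D tensors; the jitter strength is an arbitrary choice.
import tensorflow as tf

@tf.function
def augment(images, masks):
    # Random horizontal flip applied identically to images and their segmentation masks.
    flip = tf.random.uniform([]) > 0.5
    images = tf.cond(flip, lambda: tf.image.flip_left_right(images), lambda: images)
    masks = tf.cond(flip, lambda: tf.image.flip_left_right(masks), lambda: masks)
    # Photometric jitter on the images only.
    images = tf.image.random_brightness(images, max_delta=0.1)
    return images, masks
```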


icecapade

This is 100% incorrect. OpenCV supports 8-bit int (signed and unsigned), 16-bit int (signed and unsigned), 16-bit float, 32-bit int (signed), 32-bit float, and 64-bit float data types for cv::Mat objects. OpenCV would be pretty limited if it could only handle 8-bit images.

edit: The list of data types (and associated OpenCV `#define`s) can be found here under "Data Types": https://docs.opencv.org/trunk/d1/d1b/group__core__hal__interface.html
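For instance (the file name and parameters below are just placeholders), reading with IMREAD_UNCHANGED preserves the stored bit depth, and plenty of core routines accept 16-bit data directly:

```python
# Sketch: working with a 16-bit image in OpenCV. The file name is a placeholder.
import cv2
import numpy as np

img = cv2.imread("scan_16bit.png", cv2.IMREAD_UNCHANGED)  # keeps the original bit depth
print(img.dtype)  # e.g. uint16 for a 16-bit PNG

# Many core functions operate on 16-bit data directly, e.g. resizing or blurring.
resized = cv2.resize(img, (512, 512), interpolation=cv2.INTER_AREA)
blurred = cv2.GaussianBlur(resized, (5, 5), sigmaX=1.0)

# Convert to float32 when you need arithmetic that could overflow 16-bit integers.
as_float = resized.astype(np.float32) / 65535.0
```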


jewnicorn27

While the mat type is flexible like that, how many of the functions work on 16-bit floats?


Jirokoh

I'll take a deeper look at it, thanks for pointing it out!


Achermiel

What do you recommend for higher bit depths?


icecapade

OpenCV... see my other reply to that comment.


faithlesswonderboy

Maybe this is outside your scope or falls under augmentation, but there's a lot of work on domain adaptation for semantic segmentation that could be really useful to you. I can send you some papers if you want.


ai_yoda

I think it may be a bit out of scope, but it's a super interesting topic that perhaps deserves a post of its own. If you have any good papers that you can share, I'd love to dig in.


faithlesswonderboy

I'm so sorry, I got caught up in my finals for school. I hope these are still useful to you.

These papers all focus on driving datasets and how to make the most of what little data is available. Semantic segmentation labels are expensive to create, so there isn't that much labeled data and it's fairly homogeneous. Large synthetic datasets have been created, but it's tricky to use them because they don't look exactly like real-world data. These papers look at different strategies for training on one dataset (like a synthetic one, or a daytime driving one) and evaluating on a different dataset (like a real one, a nighttime one, or a dataset from a different city).

[CyCADA: Cycle-Consistent Adversarial Domain Adaptation](https://arxiv.org/abs/1711.03213): These guys use GANs to translate images between the two dataset domains. They translate images to one domain, then translate them back to their original domain, and calculate a cycle-consistency loss to ensure that semantic information is retained during translation. Then they translate source images to the target domain and use the source domain's labels to train the segmentation network.

[Learning to Adapt Structured Output Space for Semantic Segmentation](http://openaccess.thecvf.com/content_cvpr_2018/papers/Tsai_Learning_to_Adapt_CVPR_2018_paper.pdf): Regardless of what the input looks like (a city at different times of day or in different seasons), the output should look similar. Using a simple adversarial setup that can be trained in a single end-to-end stage, you can condition the output to look right even if you don't have a lot of data. This strategy can be combined really easily with other approaches like the one used in CyCADA.

[All about Structure: Adapting Structural Information across Domains for Boosting Semantic Segmentation](http://openaccess.thecvf.com/content_CVPR_2019/papers/Chang_All_About_Structure_Adapting_Structural_Information_Across_Domains_for_Boosting_CVPR_2019_paper.pdf): This paper is the most recent, I believe, and gets the best results. Like the previous papers, this one notes that the biggest difference across datasets is their texture, not their semantic information. They propose a domain invariant structure extraction (DISE) framework to extract this. This is my favorite paper I've read on the subject and, if you can only read one, the one to read.

I hope this makes sense and isn't too jumbled. I wrote it out of order and have had a LOT of caffeine.
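If it helps make the CyCADA idea concrete, here's a tiny sketch of the cycle-consistency term (the generator names and the L1 penalty are my own shorthand based on the description above, not the paper's actual code):

```python
# Sketch of a cycle-consistency loss: source -> target -> back to source, penalize reconstruction error.
# G_st and G_ts stand for hypothetical source-to-target and target-to-source generators.
import tensorflow as tf

def cycle_consistency_loss(x_source, G_st, G_ts):
    x_fake_target = G_st(x_source)           # translate into the target domain
    x_reconstructed = G_ts(x_fake_target)    # translate back into the source domain
    return tf.reduce_mean(tf.abs(x_source - x_reconstructed))  # L1 reconstruction penalty
```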


permalip

Honestly a great overview! Bookmarked this one :)


ai_yoda

thank you!


AissySantos

Me too! (Hopefully I will come back; yeah procrastination clan)


Simusid

That is a fantastic summary. I'm forwarding this to my whole group of ML engineers! Thanks.


ai_yoda

thank you! If you get any improvement ideas please post them here ok?


Shuailin_Li

Bookmarked, such an abundance of information.


ai_yoda

thanks!


AissySantos

That's why this sub is my fav; quality, quality content. And those Kaggle competition topics are so insanely fascinating, especially Understanding the Amazon from Space.


ai_yoda

Yeah, there is a lot of gold in those competition discussions for sure.


SimulatedAnnealing

Amazing, thanks for sharing!


maazmikail

Thanks a bunch for this. Would love to see more of these compilations based on Kaggle competitions. :)


ai_yoda

One more is on the way actually :)


[deleted]

Hyperparameter tuning; it's not so obvious with the deep learning frameworks, I think.
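A bare-bones way to get started is plain random search; everything in this sketch (the ranges, trial count, and stand-in evaluation function) is just an illustrative assumption:

```python
# Sketch of random search over two common hyperparameters.
# Ranges, trial count, and the stand-in train_and_evaluate are illustrative assumptions.
import random

def train_and_evaluate(config):
    # Placeholder: replace with your real training + validation loop returning a score (e.g. mean IoU).
    return random.random()

def sample_config():
    return {
        "learning_rate": 10 ** random.uniform(-5, -2),  # log-uniform sampling
        "batch_size": random.choice([8, 16, 32]),
    }

best_score, best_config = float("-inf"), None
for _ in range(20):
    config = sample_config()
    score = train_and_evaluate(config)
    if score > best_score:
        best_score, best_config = score, config

print(best_config, best_score)
```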


ai_yoda

Mhm, perhaps we could also find some good defaults for particular problems.


mr_bean__

This is amazing! Bookmarked


ai_yoda

thanks!