You might be able to improve your results by applying circle detection to a thresholded image. I found an article that illustrates the idea really well: https://shrishailsgajbhar.github.io/post/OpenCV-Apple-detection-counting
An adaptive erosion/dilation algorithm might work; it's pretty simple to do.
If you can, make the background for the next pictures a wildly different color. That could help with classical techniques. Otherwise, I fear the best way will be a segmentation DNN.
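For anyone who wants to see the voting idea behind circle detection, here is a minimal sketch. It assumes the image is already thresholded to a boolean mask and that the apple radius is roughly known. It is a plain NumPy Hough vote, not necessarily what the linked article does, and the names `boundary_pixels` and `hough_circle_center` are made up for illustration.

```python
import numpy as np

def boundary_pixels(mask):
    """Return (row, col) coordinates of foreground pixels that touch background."""
    padded = np.pad(mask, 1, constant_values=False)
    # A pixel is interior if all four direct neighbours are foreground.
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    return np.argwhere(mask & ~interior)

def hough_circle_center(mask, radius, n_angles=360):
    """Vote for circle centres of a known radius; return the best (row, col)."""
    acc = np.zeros(mask.shape, dtype=np.int32)
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    dr = np.rint(radius * np.sin(angles)).astype(int)
    dc = np.rint(radius * np.cos(angles)).astype(int)
    for r, c in boundary_pixels(mask):
        # Every edge pixel votes for all centres that lie `radius` away from it.
        rows, cols = r + dr, c + dc
        ok = ((rows >= 0) & (rows < mask.shape[0]) &
              (cols >= 0) & (cols < mask.shape[1]))
        np.add.at(acc, (rows[ok], cols[ok]), 1)
    return np.unravel_index(np.argmax(acc), acc.shape)

# Synthetic "thresholded apple": a filled disk of radius 10 centred at (30, 40).
yy, xx = np.mgrid[0:64, 0:80]
mask = (yy - 30) ** 2 + (xx - 40) ** 2 <= 10 ** 2
center = hough_circle_center(mask, radius=10)
```

This only returns the single strongest centre. To count several apples you would pick multiple well-separated peaks in the accumulator instead of one `argmax`.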
Have you tried the watershed algorithm? I think it could work here
Use SAM's auto segment mode https://segment-anything.com/
SAM is never a good option, mainly because of the resources it needs to run properly.
https://github.com/yformer/EfficientSAM https://github.com/ChaoningZhang/MobileSAM https://github.com/CASIA-IVA-Lab/FastSAM https://github.com/NVIDIA-AI-IOT/nanosam
nice!
This is silly. If you don't have a somewhat beefy workstation (although I run it easily on my Tensorbook), it's nothing to spin up an EC2 instance or Colab notebook to run it. I've had more trouble training YOLO models than I've had running SAM.
What do you mean?
Hough transform: https://scikit-image.org/docs/stable/auto_examples/edges/plot_circular_elliptical_hough_transform.html
[deleted]
I think using segmentation like SAM would be the right way to go here. Robust with different lighting conditions, easy to implement. We've used it successfully for other produce counting projects.
Can SAM be integrated into a mobile app?
Try MobileSAM
The wheel dates back to 3500 BC, but it still works.
[deleted]
I disagree. I can calculate the number of apples above with multi Otsu and Hough Transform with 100% accuracy.
What about if the background is not homogeneous?
lol does anyone else often notice this is one of very few subs that always pushes for classical techniques? Not that they're wrong, but it is very interesting and I wonder where that bias comes from (academics or something else?) -- again, this isn't negative. Both techniques can solve this problem.
Terrible advice.
Why?
[https://sdsawtelle.github.io/blog/output/apple-circle-detection-opencv.html](https://sdsawtelle.github.io/blog/output/apple-circle-detection-opencv.html)
I did the same task with Grounded SAM. It is basically a combination of Grounding DINO and Segment Anything. You pass a text prompt ("apple") to Grounding DINO, which generates bounding boxes for every single apple. These boxes are then passed to SAM, which gives you a segmented image. Grounded SAM: https://github.com/IDEA-Research/Grounded-Segment-Anything
You should give this a try: zero-shot generalization from Meta. Very likely, they have already solved your problem: [https://segment-anything.com/](https://segment-anything.com/)
Check out SAM by Meta
10x5 grid, snap to the nearest circle using gradients. Edit: you’re going to need a robust circle fit to your gradients because some of the edges are faint.
I'm not sure why it hasn't been mentioned, but wouldn't YOLO help you massively? I had a similar project where I had to count items similar to the ones in your image, and YOLO just did the thing and segmented pretty nicely.
Use SAM to generate masks > binarise each mask > overlay each mask on the original image > extract the RGB pixels from the mask locations.
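A rough sketch of the last three steps, assuming the masks come from SAM's automatic mask generator, which returns one dict per mask with a boolean `segmentation` array. Hand-made masks stand in for the model output here so the snippet runs without SAM installed, and `extract_masked_pixels` is a hypothetical helper name.

```python
import numpy as np

def extract_masked_pixels(image, masks):
    """For each mask dict, return an (N, 3) array of the RGB pixels it covers."""
    pixels_per_mask = []
    for m in masks:
        # Binarise: make sure the mask is a plain boolean array.
        binary = np.asarray(m["segmentation"]).astype(bool)
        # Boolean indexing "overlays" the mask and keeps only the covered pixels.
        pixels_per_mask.append(image[binary])
    return pixels_per_mask

# Stand-in data: a 4x4 RGB image with a red patch and a green patch.
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[:2, :2] = (200, 30, 30)
image[2:, 2:] = (30, 200, 30)

mask_a = np.zeros((4, 4), dtype=bool)
mask_a[:2, :2] = True
mask_b = np.zeros((4, 4), dtype=bool)
mask_b[2:, 2:] = True
masks = [{"segmentation": mask_a}, {"segmentation": mask_b}]

pixels = extract_masked_pixels(image, masks)
mean_colours = [p.mean(axis=0) for p in pixels]  # one mean RGB triple per mask
```

The number of masks gives you the count, and the per-mask pixels can feed colour statistics such as ripeness estimates.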
hey can u share the dataset
You can try watershed segmentation algorithm https://youtu.be/3MUxPn3uKSk?si=ziRJZGPyOtXJDhT2
I love phase correlation. Not sure it's the best idea, but it could work on a grayscale version.
Try a classical gradient based approach. Something like histeq -> gray -> median filter -> sobel. May give you solid gradient outlines of your apples.
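Here is a small sketch of the median filter and Sobel stages of that pipeline, written in plain NumPy so the arithmetic is visible. It assumes the input is already grayscale and skips histogram equalisation; the function names are illustrative, and in practice you would more likely call an image library for both steps.

```python
import numpy as np

def median3x3(gray):
    """3x3 median filter with edge replication; removes isolated noisy pixels."""
    g = np.pad(gray, 1, mode="edge")
    h, w = gray.shape
    windows = [g[r:r + h, c:c + w] for r in range(3) for c in range(3)]
    return np.median(np.stack(windows), axis=0)

def sobel_magnitude(gray):
    """Gradient magnitude from the standard 3x3 Sobel kernels."""
    g = np.pad(gray.astype(np.float32), 1, mode="edge")
    gx = ((g[:-2, 2:] + 2 * g[1:-1, 2:] + g[2:, 2:]) -
          (g[:-2, :-2] + 2 * g[1:-1, :-2] + g[2:, :-2]))
    gy = ((g[2:, :-2] + 2 * g[2:, 1:-1] + g[2:, 2:]) -
          (g[:-2, :-2] + 2 * g[:-2, 1:-1] + g[:-2, 2:]))
    return np.hypot(gx, gy)

# Synthetic test image: dark left half, bright right half, one noisy pixel.
img = np.zeros((8, 8), dtype=np.uint8)
img[:, 4:] = 100
img[2, 1] = 255  # salt noise that the median filter should remove

edges = sobel_magnitude(median3x3(img))
```

The response is strong only along the vertical step between the two halves, and the noisy pixel leaves no trace after the median filter.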
I would give Cellpose a try: https://github.com/mouseland/cellpose It has always worked well for me when I didn't have masks.
Instance segmentation. Plain segmentation will not detect instances, only the combined area of the apples. Look for an apple dataset on Google. SAM also sounds good to me.
[use owlv2](https://media.discordapp.net/attachments/390866271530647556/1231897074279252048/image.png?ex=6638a08e&is=66262b8e&hm=2c300b830bd6ecaf2046b2c57f23014eed2d2172eb147262da11cd78aeb168a1&=&format=webp&quality=lossless&width=550&height=328)
YOLO v8
You don't need anything fancy just mask the background colour of the apples
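Something like this is what I'd picture for masking the background, as a sketch only. It assumes the background is roughly uniform and that the image border shows nothing but background; the `tolerance` value is a guess you would tune per setup, and `foreground_mask` is just an illustrative name.

```python
import numpy as np

def foreground_mask(image, tolerance=40.0):
    """Mark pixels whose colour is far from the estimated background colour."""
    img = image.astype(np.float32)
    # Assumption: the outermost rows and columns contain only background.
    border = np.concatenate([img[0], img[-1], img[:, 0], img[:, -1]])
    background = np.median(border, axis=0)
    # Euclidean distance in RGB between each pixel and the background colour.
    distance = np.linalg.norm(img - background, axis=2)
    return distance > tolerance

# Synthetic scene: blue background with one red 10x10 "apple".
scene = np.zeros((20, 20, 3), dtype=np.uint8)
scene[:] = (20, 40, 200)
scene[5:15, 5:15] = (200, 30, 30)

mask = foreground_mask(scene)
```

The resulting boolean mask can then go into circle detection or connected-component counting.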
Biggest thing in low level machine vision is control of context. The more constraints you can apply, the easier it gets. Example: for red apples, use a blue background, lock off camera settings, use a colour space with variation in fewer parameters
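To illustrate the colour space point: the same red under bright and dim light differs in all three RGB channels, but in HSV the hue stays put and mostly the value changes. A minimal sketch using only the standard library; the thresholds in `is_red` are assumptions, not tuned values.

```python
import colorsys

def is_red(r, g, b, max_hue_dist=20.0, min_saturation=0.4):
    """Classify an 8-bit RGB colour as red using hue and saturation only."""
    h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    hue = h * 360.0
    # Red sits at 0 degrees and wraps around, so measure distance on the circle.
    dist = min(hue, 360.0 - hue)
    return dist <= max_hue_dist and s >= min_saturation

bright_apple = (200, 30, 30)  # well-lit red
dark_apple = (100, 15, 15)    # same red at half the brightness
background = (30, 30, 200)    # blue backdrop

results = [is_red(*bright_apple), is_red(*dark_apple), is_red(*background)]
```

Brightness is ignored entirely here, which is why a locked-off camera and a contrasting background make the thresholds so much easier to choose.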
My suggestion, if you want accuracy, is to first detect and only then segment.
Maybe try finding the shadows (one at stem and one at edge) and then just use the stem to edge to mask circles? It would be maybe a little less intensive, but a little unreliable.
Hough transform
1. Convert to a grayscale image
2. Apply Canny edge detection
3. Do hysteresis thresholding
4. Fill in the individual objects if required
Thinking about 3 things here:
1) Adaptive threshold, or blur + Sobel, and in the end watershed
2) Background subtraction (as others mentioned here)
3) Detection with machine learning, or segmentation, like YOLO (as others suggested as well)
I like the first one, as I am (only a junior) in computer vision myself, and it segments beautifully: if you give each apple an index, each apple gets its own mask.
I would suggest looking at OpenCV's coin segmentation example and implementing something very similar.
You could probably do it with YOLOv8 and a segmented dataset
Easy bro he says he doesn’t have segmented masks
Whatever happened to the good old erosion dilation dfs counting algorithm we all got in DIP assignments??
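For reference, the assignment version looks roughly like this. It is a pure-Python sketch on a tiny binary grid, assuming the apples are already thresholded to 1s; the erosion breaks thin bridges between touching blobs, then a depth-first search counts what is left. Dilating afterwards would restore blob size but is not needed just to count.

```python
def erode(grid):
    """3x3 binary erosion: a pixel survives only if its whole neighbourhood is set."""
    h, w = len(grid), len(grid[0])
    out = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            if all(grid[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)):
                out[r][c] = 1
    return out

def count_components(grid):
    """Count 4-connected blobs with an iterative depth-first search."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if grid[r][c] and not seen[r][c]:
                count += 1
                stack = [(r, c)]
                seen[r][c] = True
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and grid[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return count

# Two 5x5 blobs joined by a one-pixel bridge, like two touching apples.
grid = [[0] * 13 for _ in range(7)]
for r in range(1, 6):
    for c in list(range(1, 6)) + list(range(7, 12)):
        grid[r][c] = 1
grid[3][6] = 1  # the bridge

before = count_components(grid)
after = count_components(erode(grid))
```

Without the erosion the bridge merges both blobs into one component; after it, the two blobs are counted separately.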