OpenCV Image Processing Project

Awesome and very doable.
A 6DOF arm perhaps. Small precision servos and FDD/DVD mechanics would provide ready parts and more than adequate precision and accuracy.

3 Likes

Turntable Setup (for non-microscopic objects):


This image is taken from http://digitalscholarship.blogs.brynmawr.edu/files/2018/12/Photogrammetry-Background-and-Methods.pdf

This is one of the ways of making a turntable for non-microscopic subjects. Here, we can notice the following:

  • The lighting should be uniform and diffuse, so that the object does not show reflections; reflections can confuse the algorithm, which may then produce faulty 3D structures.
  • The background should be a solid colour with no irregularities. While the turntable is rotating, nothing in the shot should appear to be stationary; otherwise the algorithm treats all the images as if they were shot from a single camera position and ruins the output.

Another crucial observation, based on the following video (watch from 7:08 onwards), is that:

  • It is better if the visible face of the turntable is textured rather than plain white. The texture gives the algorithm feature points in the background that rotate with the object, tricking it into believing that the camera is moving around a stationary object (a quick feature-count check is sketched after this list).
  • If possible, spray chalk powder on the object to reduce its shininess.
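As a rough sanity check on a candidate setup, one can count how many feature points a detector finds in a test shot; too few usually means the background or turntable face is too plain or too shiny. The sketch below uses OpenCV's ORB detector and an assumed file name, not whatever detector the reconstruction pipeline uses internally.

```python
# Rough sketch: count feature points in a test shot to judge whether the
# turntable texture and lighting give the reconstruction enough to work with.
# "test_shot.jpg" is a placeholder file name; ORB is used only as a stand-in
# for the detector the actual pipeline employs.
import cv2

img = cv2.imread("test_shot.jpg", cv2.IMREAD_GRAYSCALE)
orb = cv2.ORB_create(nfeatures=5000)
keypoints = orb.detect(img, None)
print(f"Detected {len(keypoints)} keypoints")

# Very few keypoints usually means the visible surfaces are too plain or too
# shiny; adding texture (or chalk powder on the object) should raise the count.
```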
3 Likes

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.21.7947&rep=rep1&type=pdf

This research paper looks at ways of implementing microscope photogrammetry and the challenges it poses.
It also discusses how stereo-microscopes are the most suitable for taking photographs for photogrammetry, and the idea of tilting the stage to obtain angled images. A Digital Terrain Model (DTM) giving the contours of a concrete surface is also generated using an aerial photogrammetry package known as VirtuoZo.

2 Likes

Update:

Comparison between the openMVG-openMVS pipeline and Meshroom:

So far, openMVG and openMVS have given more than satisfactory results, providing a good 3D reconstruction even from low-quality images. However, a major issue with these libraries was the long time (~1 hr on 10-20 images) taken for the final dense point cloud generation.

Both libraries use the CPU for reconstruction, whereas the process could potentially be accelerated with a dedicated GPU. However, openMVG currently doesn't support CUDA, and openMVS doesn't fully utilize CUDA (it only uses it for refining the final textures).
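For reference, the pipeline we run has roughly the shape sketched below: openMVG does the sparse part (image listing, features, matching, incremental SfM) and openMVS does the dense part (densification, meshing, refinement, texturing). The binary names are the standard ones shipped by the two projects, but the flags, focal-length value and directory layout here are assumptions rather than our exact commands.

```python
# Rough sketch of the openMVG -> openMVS pipeline; flags, the focal-length
# value and the directory layout are assumptions, not our exact commands.
import subprocess

def run(cmd):
    print(">", " ".join(cmd))
    subprocess.run(cmd, check=True)

IMAGES, OUT = "images", "out"

# openMVG: sparse reconstruction (CPU only)
run(["openMVG_main_SfMInit_ImageListing", "-i", IMAGES, "-o", f"{OUT}/matches", "-f", "2000"])
run(["openMVG_main_ComputeFeatures", "-i", f"{OUT}/matches/sfm_data.json", "-o", f"{OUT}/matches"])
run(["openMVG_main_ComputeMatches", "-i", f"{OUT}/matches/sfm_data.json", "-o", f"{OUT}/matches"])
run(["openMVG_main_IncrementalSfM", "-i", f"{OUT}/matches/sfm_data.json",
     "-m", f"{OUT}/matches", "-o", f"{OUT}/sfm"])
run(["openMVG_main_openMVG2openMVS", "-i", f"{OUT}/sfm/sfm_data.bin",
     "-o", f"{OUT}/scene.mvs", "-d", f"{OUT}/undistorted"])

# openMVS: dense reconstruction -- DensifyPointCloud is the ~1 hr bottleneck,
# and (per the note above) CUDA only comes in around the final texture refinement.
run(["DensifyPointCloud", f"{OUT}/scene.mvs"])
run(["ReconstructMesh", f"{OUT}/scene_dense.mvs"])
run(["RefineMesh", f"{OUT}/scene_dense_mesh.mvs"])
run(["TextureMesh", f"{OUT}/scene_dense_mesh_refine.mvs"])
```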

We decided to test another open-source library, Meshroom. Its MVS portion uses CUDA, so it was expected to provide quicker results.

Another benefit of using Meshroom is that it provides pre-compiled binaries, in contrast to openMVG-openMVS, which need to be built from source before they can be used. This allows Meshroom to be set up a lot quicker.

Since it requires more resources than our laptops provide, we turned to Google's free cloud service, Colab, which provides a Linux instance with the following specs (a sketch of a headless Meshroom run on Colab follows the specs):

  • GPU : 1x Tesla K80, 2496 CUDA cores, 12 GB GDDR5 VRAM
  • CPU : Intel Xeon @ 2.30 GHz (1 core, 2 threads)
  • RAM : ~12.7 GB
  • Storage : ~33 GB
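Since Colab sessions start clean, Meshroom's pre-compiled Linux release can simply be downloaded, extracted and driven through its command-line pipeline. The sketch below shows the rough shape of such a run; the release version/URL, folder names and the CLI entry point (older releases call it meshroom_photogrammetry, newer ones meshroom_batch) are assumptions to be adapted as needed.

```python
# Rough sketch of a headless Meshroom run on Colab; the release URL, folder
# names and CLI entry point are assumptions and should be checked against the
# actual Meshroom release being used.
import subprocess

RELEASE = "https://github.com/alicevision/Meshroom/releases/download/v2019.2.0/Meshroom-2019.2.0-linux.tar.gz"

subprocess.run(["wget", "-q", RELEASE], check=True)
subprocess.run(["tar", "-xzf", "Meshroom-2019.2.0-linux.tar.gz"], check=True)

# Full photogrammetry pipeline on the uploaded photos; the MVS stages are the
# CUDA-accelerated parts that need the Colab GPU.
subprocess.run([
    "./Meshroom-2019.2.0/meshroom_photogrammetry",
    "--input", "/content/photos",
    "--output", "/content/output",
], check=True)
```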

Meshroom was tested on 2 datasets: a test dataset provided by the Meshroom project, and our Rubik's cube photos.

Buddha dataset (67 high-res images; total time for final texture generation was about an hour):

Rubik's cube dataset (13 low-res images; ~10 minutes for final texture generation):

Though Meshroom generated the final texture for the Rubik's cube almost 6 times quicker, the output was lacking and even broken in some areas. The Buddha dataset took over 2 hours on openMVG-openMVS, but that output could not be viewed, as it was highly detailed and required more RAM than our laptops could provide.

So, compared to openMVG-openMVS, Meshroom doesn't work well with low-resolution images and requires a lot of images.

Hence we plan not to use Meshroom for the final output; instead we will use it as an intermediate step while testing, to visualise how the output might look and to identify which faces require additional images for proper 3D reconstruction.

3 Likes

Update:
We have decided to use a NEMA 17 stepper motor for rotating the turntable, since it is more widely used and relatively cheaper compared to NEMA 8 and other low-load stepper motors.

Here is a list of available NEMA 17 models:
https://reprap.org/wiki/NEMA_17_Stepper_motor

It gives the standard dimensions of a NEMA 17 stepper motor, which we'll be using to build the CAD model. It also lists the holding torque, rated voltage, rated current, motor length and a few other properties of each model.

Since we will be using the turntable for lightweight objects, we don't require a large holding torque (25-35 N.cm should be sufficient).
For now we're designing the model as per the dimensions of the SY42STH47-1206A, which has the following properties (a quick steps-per-photo calculation based on the step angle is sketched after the list):

  • Holding Torque : 31.1 N.cm
  • Rated Voltage : 4.0 V
  • Rated Current : 1.2 A
  • Motor Length : 40 mm
  • Shaft Diameter : 5 mm
  • Step Angle : 1.8°
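With a 1.8° step angle, working out how far to advance the turntable between shots is simple arithmetic; here is a minimal sketch, assuming full stepping, a direct 1:1 coupling between the shaft and the turntable, and an arbitrary 40 photos per revolution.

```python
# Minimal sketch: motor steps to advance the turntable between photos.
# Assumes the 1.8 deg step angle above, full stepping (no microstepping) and a
# 1:1 coupling between motor shaft and turntable; 40 shots/rev is arbitrary.
STEP_ANGLE_DEG = 1.8
PHOTOS_PER_REV = 40

steps_per_rev = round(360 / STEP_ANGLE_DEG)        # 200 full steps per revolution
steps_per_photo = steps_per_rev / PHOTOS_PER_REV   # 5 steps, i.e. 9 deg per shot

print(f"{steps_per_rev} steps/rev, {steps_per_photo:g} steps between photos")
```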
3 Likes

Came across this post shared by @damitr

2 Likes

This is the link to our GitHub repository where we have documented the algorithm. We have also uploaded results and output files of the test images (Laptop, Cube and Buddha datasets).

3 Likes

A list of useful references is given in the PDF:
http://www.cs.cmu.edu/~aayushb/Open4D/Open4D.pdf

Abstract

Many popular tourist landmarks are captured in a multitude of online, public photos. These photos represent a sparse and unstructured sampling of the plenoptic function for a particular scene. In this paper, we present a new approach to novel view synthesis under time-varying illumination from such data. We present a new approach for reconstructing light fields under the varying viewing conditions (e.g., different illuminations) present in Internet photo collections. Our approach builds on the recent multi-plane image (MPI) format for representing local light fields under fixed viewing conditions. We introduce a new DeepMPI representation, motivated by observations on the sparsity structure of the plenoptic function, that allows for real-time synthesis of photorealistic views that are continuous in both space and across changes in lighting. Our method can synthesize the same compelling parallax and view-dependent effects as previous MPI methods, while simultaneously interpolating along changes in reflectance and illumination with time. We show how to learn a model of these effects in an unsupervised way from an unstructured collection of photos without temporal registration, demonstrating significant improvements over recent work in neural rendering.
https://research.cs.cornell.edu/crowdplenoptic/

2 Likes