HDR / Exposure Bracketing

I don’t keep the highest bit-depth scans when working with TIFF; I can’t afford the storage space. I think it may have been an 8-bit scan and 8-bit in Lightroom.

When I get time I will do a shootout between 8-bit and 16-bit B&W. I know people overblow the RAW vs. JPEG debate, so I’ll have to check out the bit-depth debate for myself.

Here are tests of 31 generations of JPEGs.

I work in a tremendous number of areas for a broke, one-person archive… film, VHS, audio, photos, magazines and publications, ephemera… and have a huge number of scans to deal with. Plus my own photography’s storage demands are also very large. I can’t afford the latest LTO drives and have to deal with individual drives. Magnetic HDDs are only so good for longevity. SSDs are terrible for longevity unless you keep them powered, and can lose data fast in an unpowered state. So anything of archival import eventually gets put on M-Disc. And inkjet-printable M-Disc is $$, so that is another costly area of my budget.

What I’m getting at is I can’t always work in the highest quality available and many times make do with ‘decent’ quality rather than ‘best’ quality. Now they are talking about putting data on synthetic quartz the size of a Post-it note via laser. Who knows how much that will cost or if it is even feasible for the average Joe or Jane’s budget?

As far as color correction?

My video software is very basic and it does red color correction poorly. Talk to the users of the top post software to get an answer. I’ve seen a demo by Lasergraphics of an instant color correction of red-faded film via their scanner software. Looked impressive. Since I can’t do much with red-faded films, I turn them into B&W for now. If the lotto ever cooperates then all doors are open. But until that time, this is how I do it.

You can talk to Perry at Gamma Ray, he is into HDR film scans.

He said he does HDR in the scanner versus my method of doing it in post. My still scanner does not scan at various exposures, so again, that is how I make do with what I’ve got. The 2K Retroscan scanner can run scans at different exposures, but I have no idea how I’d combine them.

Good luck!

As I am working on a Super-8 telecine based on HDR, let me throw in my 2 cents as well.

First, a color-reversal film like Kodachrome 25 has a tremendous dynamic range. When projected onto the screen, our visual system is exceptional at viewing these high-contrast images.

Maybe in a few years, when HDR gear and formats are widely available, we will be able to work directly in HDR. Until then, we are stuck with display hardware barely able to display an 8-bit per channel image properly. Also, most common video file formats compress (among other things) the quantization fidelity of the color channels. So currently, a major challenge is to acquire the huge dynamic range of the original source material faithfully and later map it onto the reduced dynamic range of the displays currently available and in use.

HDR is used by a lot of people in quite different ways. So let’s clarify a little bit:

  1. The standard HDR acquisition pipeline uses a range of different exposures of a single frame. Normally, these different exposures are sufficient to calculate the transfer characteristic of the imaging system from a set of images. This is an essential step, as these curves are needed to map the normal images into the HDR color space. What you end up with is an HDR image which ideally captures the full dynamic range of the scanned frame. However, these images look rather dull when displayed on standard equipment. So an additional step is usually needed: “tone-mapping”. A tone-mapping algorithm converts the HDR image into an image which “looks right” on a normal display. There are a lot of tone-mapping algorithms around these days, and I can assure you that every one of them usually fails to achieve a satisfactory result. You end up manually tuning the mapping - something you might not want to do for every single scene of a movie.

  2. Occasionally, people do HDR work with 12-bit, 16-bit, etc. “raw” material. Here, no HDR capture is involved at all, only the later “tone-mapping” process. At most, artificially “exposed” images are created as intermediates, but they are all based on the single raw image. This is not HDR in the true sense, it is just the tone-mapping part: reducing a high-dynamic source to something a low-dynamic display can handle.

  3. There is another technique which combines a stack of differently exposed images into a single low-dynamic image. This is called “exposure fusion”, and that algorithm actually mimics in some sense the way our own visual system is conquering the huge dynamic range of a projected image.
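The acquisition pipeline of approach 1. can be sketched in a few lines of numpy - a toy version that assumes a perfectly linear camera response and known exposure times, so the transfer-curve estimation step mentioned above is skipped:

```python
import numpy as np

def merge_to_hdr(images, exposure_times):
    """Merge differently exposed 8-bit frames into one HDR radiance map.

    Assumes a perfectly linear camera response; a real pipeline first
    estimates the transfer curve from the stack (e.g. Debevec/Malik).
    """
    acc = np.zeros(images[0].shape, dtype=np.float64)
    wsum = np.zeros(images[0].shape, dtype=np.float64)
    for img, t in zip(images, exposure_times):
        z = img.astype(np.float64) / 255.0
        w = 1.0 - np.abs(2.0 * z - 1.0)   # hat weight: trust mid-tones most
        acc += w * z / t                  # back-project pixel to scene radiance
        wsum += w
    return acc / np.maximum(wsum, 1e-6)

def tonemap_reinhard(hdr):
    """Global Reinhard operator: squeezes [0, inf) radiance into [0, 1)."""
    return hdr / (1.0 + hdr)
```

The result of `merge_to_hdr` is exactly the kind of image that looks dull on a normal display, which is why the tone-mapping step (here the simplest possible global operator) is needed afterwards.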

I have worked with all three of the above approaches, implementing each of them in my own software. From my experience, only approach 3., exposure fusion, gives you satisfactory results in the end. Most importantly, it is a mostly parameter-free approach. So a whole movie can be processed, from normally exposed imagery down to scenes which are 1 or 2 stops underexposed. Here’s an example:

On the right side, you see the best single-exposure capture I could achieve for this particular frame. As one would expect, the highlights are blown out (compare the rocks on the left side of the road), while structure and color are lost in the dark areas (for example, the blue of the jeans is quite similar to the dark brown color of one man’s coat in the frame capture). Well, only the mid-tones are captured ok.

Now, to the left is the result of the exposure fusion algorithm. Both dark and bright image areas are better rendered, and the mid-tone range is still ok.

This particular frame is from a Kodachrome 25 Super-8 movie. It was digitized with five different exposures, spanning an exposure range of 5 f-stops. Anything less would result in loss of detail in either the bright or dark areas of the image. Here’s another example, captured and processed with the same transfer parameters. There is no manual adjustment done between the capture of the two scenes:

Again, on the right side you see the best-exposed single frame, on the left side the result of exposure fusing a stack of 5 different exposures, with a dynamic range of 5 f-stops.

Of course, capturing HDRs from a movie comes at a price: you need to stop the frame perfectly before you start acquiring images. Even the slightest movement of the frame will ruin your result. Also, at least the cameras I had available have a substantial lag between switching the exposure value and actually delivering the requested exposure. It turns out that you need to wait between 4 and 9 frames for the exposure to settle, depending on the camera model.

All this makes HDR capture a slow process. My system currently needs about one minute of capture time for one second (18 frames) of film. This is at a capture resolution of 2880 x 2160 pixels. For lower resolutions, the system is slightly faster. Exposure fusion is also a quite complex computational process. For the resolution quoted, it takes about 1250 ms to read the captured frames into memory, and then about 3725 ms to compute the fused result - per frame. Again, lower resolutions are substantially faster to process.


Single-image HDR is sometimes called pseudo-HDR, but don’t confuse the two. You will get different results from just plugging the single image into HDR software and hitting play versus processing 3 or more exposures in post that are used as if they were separate camera exposures. But it does not matter what people call it… I just do it.

If I just plugged in a single image to the HDR software the results would not be the same as the single image B&W HDR I sent in.

The best bet for HDR is in-camera multiple exposures, but that is impossible when things move.

Here is another single-image HDR with a number of post exposures. (Don’t remember how many.) I don’t have the original handy, but lots of detail was missing in the original. It is not exactly invisible HDR, but close enough for me.

This HDR below is more of a ‘hyper-real’ style HDR, sometimes called painterly. Again, multiple exposures in post.


Thanks guys for your replies, very very helpful.

I guess @cpixip I’m referring to exposure fusion. Your results are incredible, I’m not sure if I’m more envious of the closeness you’ve got to a projected image, or the fact you were able to shoot K25 in Super 8. When I started shooting film seriously the only supplies available were a few years past expiry with very variable quality (really dependent on how the seller had stored it and how honest they were about freezing it), and K40 had just been discontinued as well.

I’m hoping that the FLIR software I’m using won’t suffer from the exposure lag, as the HDR is a camera design feature and it has a high FPS capacity, but I can’t actually find any documentation so I’ll have to test it out and report back. My modified Eumig 610D only goes as low as 3fps so it will be impressive if I can get it to work without changing the motor out.

@bainzy - concerning “Kodachrome 25”… - well, my memory was twisted a little bit here. The Super-8 film stock I was actually using was Kodachrome K40. Kodachrome 25 was the stock I used for 35 mm photography at that time. I do mix that up occasionally…

Actually, I still own a single Kodachrome 40 cartridge from 1988, at that time priced about 19 DM, which is more or less equivalent to about 10 € nowadays:

Too bad there’s no lab available any longer to process it. There was still a lab in Switzerland, but it closed around 2006…

Anyway. From my experience, the exposure fusion algorithm (Mertens/Kautz/Van Reeth) is the best bet on combining different exposures of a single frame into something which is “viewable”.

Most importantly, it is basically a parameter-free approach (there are parameters, and for optimal results, you want to tweak them together with the specific camera you are using). Once you find a sweet spot, it does not matter much if you scan differently exposed scenes or even different film stock - the results look ok. A great advantage in my view.

The original authors of the exposure fusion algorithm did not discuss much how their approach actually works. Here’s my take on that: if you come from a photographic background, you know it was standard practice in an analog photography lab to brighten and darken certain image parts which would otherwise come out too dark or not show enough texture (dodging and burning). You would select a main exposure time for the print and hold your hand or appropriately cut-out paper pieces over certain image areas which would otherwise come out too dark. After that, you would do an additional exposure to further darken local image areas which would otherwise be too bright in the final print.

In a way, exposure fusion automates this technique: they devised certain image operators looking for regions in each of the exposures of a scene which are

  • well-exposed
  • have a good color saturation
  • or a good local image contrast

Now, to combine these areas of interest, they chose a well-known image combination algorithm which is pyramid-based and was initially proposed by Burt/Adelson in 1983. That is more or less the basic idea behind exposure fusion, as far as I understand it.
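For the curious, that pyramid machinery can be sketched in plain numpy - a toy version using 2x2 block averaging and pixel repetition instead of the Gaussian filtering of the original Burt/Adelson paper, and trivial weight maps instead of the three Mertens quality measures:

```python
import numpy as np

def down(img):
    """Halve the resolution by averaging 2x2 blocks (stand-in for blur + subsample)."""
    h, w = img.shape
    return img[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up(img, shape):
    """Double the resolution by pixel repetition (stand-in for proper upsampling)."""
    big = img.repeat(2, axis=0).repeat(2, axis=1)
    return big[:shape[0], :shape[1]]

def gaussian_pyramid(img, levels):
    """Successively downsampled copies - used for the weight maps."""
    pyr = [img]
    for _ in range(levels):
        pyr.append(down(pyr[-1]))
    return pyr

def laplacian_pyramid(img, levels):
    """Band-pass detail levels plus a coarse residual - used for the images."""
    pyr, cur = [], img
    for _ in range(levels):
        small = down(cur)
        pyr.append(cur - up(small, cur.shape))  # detail lost by downsampling
        cur = small
    pyr.append(cur)
    return pyr

def collapse(pyr):
    """Rebuild an image by upsampling and adding the levels back together."""
    cur = pyr[-1]
    for lap in reversed(pyr[:-1]):
        cur = up(cur, lap.shape) + lap
    return cur

def fuse(images, weights, levels=2):
    """Blend Laplacian pyramids of the images, weighted per level by
    Gaussian pyramids of the weight maps - the core of exposure fusion."""
    lps = [laplacian_pyramid(i, levels) for i in images]
    gps = [gaussian_pyramid(w, levels) for w in weights]
    blended = []
    for lvl in range(levels + 1):
        num = sum(g[lvl] * l[lvl] for l, g in zip(lps, gps))
        den = sum(g[lvl] for g in gps) + 1e-6
        blended.append(num / den)
    return collapse(blended)
```

Blending in the pyramid domain rather than per pixel is what avoids the visible seams you would get from a naive weighted average of the exposures.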

In any case, the best part of all this is that you do not even have to bother understanding or implementing this algorithm yourself (well, I did that anyway… :joy:) - it is implemented as cv2.createMergeMertens() in the OpenCV computer vision library. If you want to look at exposure fusion and all the other options (HDR creation + tone-mapping) the OpenCV library has available, the following link shows you example code in C++, Java and Python (OpenCV HDR algorithms) which you can use as a starting point for your own software.

Concerning the exposure lag and so on. There are two different points which are important here. For one, the image which is taken by the camera at a specific point in time is first processed in the camera, then transmitted to your computer and there basically processed again by the device driver. All this happens before your software even sees the data. That pipeline introduces a fixed temporal delay, which could however be taken into account by a proper software design.

If you are working with low-cost hardware, you are going to meet another challenge when attempting HDR captures: your camera needs some time to actually reach the desired exposure level.

I specifically looked at three different low-cost cameras: the Raspberry Pi v1, the Raspberry Pi v2 and the see3cam_cu135 camera.

The Raspberry Pi cameras have all sorts of automatic stuff running which is difficult or even impossible to deactivate for full manual control. Specifically, you cannot immediately switch to the requested exposure. It takes at least 3-4 frames until the actual exposure time is even close to the requested one.

The same is true for the see3cam_cu135 camera I am currently using. Here’s a plot of see3cam_cu135 data showing that behaviour:

What you see here vertically is the mean brightness of a fixed film frame. Horizontally, the frame number after an exposure event is displayed. As you can see from the traces, it takes 3 frames until the camera even reacts to the new exposure time (I started with two slightly different initial exposure settings, one close to 100, one close to 128 - that explains the two different lines at frame positions 1-3).

This initial delay is most probably the delay introduced by the camera + device driver pipeline.

After this initial delay, the camera switches to the vicinity of the requested exposure value, but not exactly to that value. Mostly, the exposure values seen at frames 4 and 5 do not match the requested exposure values. Even worse (look at the brown trace!), sometimes the exposure only decays very slowly to the requested value.

To make matters worse, how long that relaxation lasts depends heavily on both the initial and the final exposure values - and, as these traces show, you might have to wait more than 40 frames with that specific camera for the exposure to settle. That is clearly impractical if you are scanning thousands of film frames. If you do it anyway, you will notice some flicker in the exposure-fused imagery.
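If you want to measure that settling behaviour for your own camera, the detection logic is simple - log the mean frame brightness after each exposure change and find the first frame of a stable run. A sketch of that logic (the tolerance and window size are assumptions you would tune per camera):

```python
def settling_frame(brightness, target, tol=2.0, window=3):
    """Index of the first frame from which the mean brightness stays within
    `tol` of `target` for `window` consecutive frames; None if it never settles."""
    run = 0
    for i, value in enumerate(brightness):
        if abs(value - target) <= tol:
            run += 1
            if run == window:
                return i - window + 1
        else:
            run = 0
    return None
```

Running this over brightness traces like the ones plotted above gives you a per-camera settling time you can then hard-code as a wait into the capture loop.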

Actually, exposure fusion has the nice property of equalizing shortcomings of your camera images, provided you are using more images than necessary to achieve a certain dynamic range: errors in camera exposure will be reduced and camera noise will go down as well in the exposure-fused image.

With your FLIR camera, you might have a much better setup available than I have. I worked with such cameras when the guys were still “Point Grey Research”, long before they were bought by FLIR. But rapid exposure control was not something we needed and looked into at that time. I think it would be interesting for the forum if you can report your results here!

There can be two ways to vary the exposure for HDR:

  • Vary the exposure speed of the camera
  • Vary the intensity of the light source while keeping the exposure constant

In my project with a modified projector and a Raspi camera I use the first method. For each frame the process is as follows:

  • Set the camera to auto exposure and take a “normal” frame
  • Switch to manual exposure and multiply the exposure by a certain factor to capture an underexposed frame (factor 0.1 for example) and an overexposed frame (factor 1.5 for example)

This gives me good results, but the drawback is that you have to wait a certain number of frames between each change (about 8 for the switch to auto and 4 for each change of exposure), so the capture is slowed down. I wonder if the second method could give good results and faster capture.

For this solution, there are two ways to vary the intensity of the light source:

  • Constant voltage (for example 12 V) and PWM
  • Variable voltage

I am afraid that the first solution causes a banding effect with a “rolling shutter” camera and that the second solution causes color variations. Does anyone have any ideas about this problem?
Dominique Galland
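The exposure arithmetic of that first method can be sketched like this - the helper name and default factors are just illustrative; with the picamera library, the auto-exposure value would, if I remember the API correctly, be read from camera.exposure_speed and written back via camera.shutter_speed:

```python
def bracket_shutter_speeds(auto_exposure_us, factors=(0.1, 1.0, 1.5)):
    """Derive a bracket of manual shutter speeds (microseconds) from the
    auto-exposure reading: underexposed, normal and overexposed frames."""
    return [max(1, round(auto_exposure_us * f)) for f in factors]
```

For example, an auto-exposure reading of 10000 µs would yield shutter speeds of 1000, 10000 and 15000 µs for the three bracketed frames.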

Hi @dgalland - well, I do have some ideas… :wink:

  • vary exposure - depends heavily on the camera characteristics. Specifically, how fast the camera responds to the command to change the exposure. Have a look at the second picture of this post, which shows the slow response to an exposure change request for a specific camera.
    Actually, it would be great to compile this specific behaviour for various cameras, as it is nowhere to be found in any datasheet.

  • vary intensity of the light source - this is from my experience the faster way to grab a set of LDRs for HDR computations. If you work unsynchronized with the camera (my setup) you still have the delay through the software pipeline (light changes → camera takes picture → camera sends it → low-level device driver gets frame → finally, the frame arrives at your software). From my experience, this introduces anything from 2 to 4 frames of delay. Of course, this delay is fairly constant and you might trigger a change of illumination ahead of time in order to speed things up. Note that any speed-up you gain here is multiplied by the number of frames you are taking, so it’s worthwhile to examine the above-sketched pipeline in detail.

  • use autoexposure as reference - firstly: three different exposures are normally sufficient to capture, for example, Super-8 material. Going to 5 exposures (my system) additionally reduces camera noise and gives you even more headroom for post-processing. Fixing the center exposure via autoexposure is a neat idea, as it kind of optimizes the dynamic range which is captured. However, autoexposure algorithms usually take several frames to lock onto their target, which will slow things down. Also, because of the autoexposure, your material will be kind of equalized - you might need to restore the original look manually in post. Which leads me to another bullet point you did not mention:

  • working with raw - the most challenging material (which is from my experience home-made Super-8 or similar stuff) requires at least 12 bits of usable (read: not 12 bits as specified by the camera manufacturer :wink: ) dynamic range. Good modern cameras can deliver that. However, with 12 bits, you barely have any headroom for under- or overexposed frames. If you want to be on the safe side and digitize everything which is on the film, 14 (usable) bits is probably a better choice. However, if you throw in your autoexposure approach, 12 bits should be sufficient. And with raw, you have more work to do in post anyway, so the brightness correction mentioned in the previous bullet is required in any case.

  • PWM’ed light sources - that will depend on the PWM frequency of the light source together with the specific exposure settings you are working with. Banding might occur in certain cases. On the other hand, there is hardware readily available for that purpose, for use in photographic and video settings. So it’s doable, but usually a little pricey. That’s why in my system, I opted for

  • variable current sources - the light intensity of LEDs is more or less proportional to the current running through them. For my system, I implemented software-driven current sources. I started with 10 bits of resolution, but later raised this to 12 bits. It’s basically a combination of a digital-to-analog chip together with an operational amplifier tied to a power-FET in such a way that it acts as a current source for the driven LED. 10 bits work ok if you are aiming at working with 3 different exposures; 12 bits are necessary if you want to capture the full dynamic range of the movie material. More does not make too much sense in my system, as the noise level takes over.
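The arithmetic behind such a DAC-driven current source can be sketched as follows - assuming the common topology where the op-amp servos the FET so that the voltage across a sense resistor equals the DAC output voltage; all component values below are made-up examples, not the ones from my build:

```python
def dac_code_for_current(i_led, r_sense=1.0, v_ref=4.096, bits=12):
    """DAC code for a target LED current (amperes).

    With the op-amp regulating the sense-resistor voltage to equal V_dac,
    the LED current is I = V_dac / R_sense, so the wanted DAC voltage
    is simply I * R_sense. The code is clamped to the DAC's range.
    """
    full_scale = (1 << bits) - 1
    code = round(i_led * r_sense / v_ref * full_scale)
    return max(0, min(full_scale, code))
```

With these example values, one 12-bit step corresponds to a current step of roughly 1 mA, which illustrates why 10 bits are enough for 3 exposures but 12 bits are needed for the full dynamic range.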

So in summary, from my experience, changing illumination levels gives you a faster scanning time than changing the exposure times on the camera. Illumination changes could be realized via PWM (I have not tried that) or adjustable current sources (I implemented the latter). If you have some small electronics background, building a software-controlled current source is not too hard. If not, you might try the PWM approach. It should work as well, certainly when using longer exposure times, but I have no experience with that technique.


@cpixip thank you for the discussion
I had read all your posts (bracketing, lighting, …) including the electronic diagram for controlling the LEDs.
I believe we both work with a Picamera (V1 or HQ) but our choices are different:

  • I simply use a good quality white led spot with a diffuser
  • I bracket the exposure around the auto exposure
  • I capture in JPEG
  • As in Joe Herman’s project, I transmit the frames to a PC for processing (mertens merge, local calibration, …)

So yes, there are delays (7 frames to switch to auto and 4 to change the exposure), so at least 18 frames per captured frame, which, measured in the end and all inclusive, comes to at least 2 s per frame. But I am satisfied with the result. For amateur films shot with basic cameras the brightness of the scenes is very variable and sometimes bad, so starting from auto exposure for each frame seems a good solution to me.
So it was more out of curiosity that I asked the question about bracketing with variable lighting, but:

  • It seems that the Pi camera takes as long to adapt to the new lighting as it does to vary the exposure (2 to 4 frames)?
  • I understood that you work with RGB LEDs and that the software corrects color variations. I think it couldn’t work with a simple white LED?

Among ideas to explore, I also thought of a P-Iris lens in which the aperture is controlled by a stepper motor!

Hello @dgalland - thanks for the feedback! I am familiar with your project and it gave me indeed some important guidance along my own design approach!

That approach seems to be a very common option for realizing a film scanner these days: pairing an Arduino for the low-level stuff, a Raspberry Pi + camera (preferably the HQ camera) for image acquisition and a LAN-based transfer to a high-power PC for the rest. Joe Herman’s, yours and my systems all work along these lines. Currently, I am able to capture and store five different LDR exposures (as JPEGs with a size of 2016 x 1512 px) in about 2 seconds. Post-processing actually takes longer; I need about 2.5 seconds per HDR frame!

Concerning the autoexposure of Raspberry Pi cameras - yes, they need a few frames to adapt to new light settings. I never measured it because I am not using it. But since this is an adaptive algorithm, chances are that the time to settle to a new value depends on how large the step between different illumination levels is. From what I understand, the algorithm should normally lock onto the new exposure level within a few frames. By the way, since the Raspberry Pi foundation has introduced the libcamera stack, there is potentially the possibility to roll your own algorithm. However, so far I have not managed to install the libcamera stack properly, so I cannot report on this further.

Concerning the light source I am using, that is a long story. I actually started with a white-light LED and a diffuser, but I simply could not get the diffuser to work ok. Also, I discovered that I had some weird film stock which would need some dramatic color compensation in order to digitize correctly. For example, sequences filmed directly from the screen of a color TV set in the 70s, or film stock where I forgot to properly set the daylight filter of the camera. All of this pushed me into designing an integrating sphere with different LEDs for the primary colors. I based my initial choice of LEDs on some research, taking into account the different sensitivities of my camera and the characteristic densities of film stocks, only to find out that my initial choice did not deliver what I was hoping for. So I actually tried white-light LEDs and obtained reasonable results. After this, I tried again several different combinations of LED wavelengths in an RGB setup and finally arrived at a workable combination of RGB wavelengths.

In hindsight, it may be that the broader spectrum of usual white-light LEDs is better suited for sampling film stock. If you are unlucky, the narrow spectral peaks of an RGB-LED combination might just by chance pick up a dip in the transmission spectrum of the film dyes, leading to false colors. With a broad-spectrum white-light LED, this risk is much lower.

Anyway, with different LEDs for red, green and blue, I am able to compensate the small color shifts introduced by driving LEDs with different intensities. Here’s an example, namely the intensities of the different colors corresponding to pure white at five different exposure levels (max. intensity is 4095, corresponding to 12 bits):

int RedLED[]   = { 113, 243, 525, 1094, 2146, 0};
int GreenLED[] = { 126, 303, 725, 1725, 4095, 0};
int BlueLED[]  = {  52, 111, 235,  470,  889, 0};  

As you can see, the green LED is driven with the highest current in each setting, the blue LED with the lowest. Also, the relative amplitudes change in order to keep the white balance constant for the camera. This reflects the different responses of the LEDs to different driving currents.

Using a programmable f-stop is an interesting idea. That might be faster and easier to implement than a programmable light source. However, each lens usually has a well-defined f-stop range where it delivers optimal performance. Opening up too much or closing the f-stop too much will lead to increased blur. That might be a challenge with this approach.

So let’s continue the discussion on other points!

  • You say that you can capture and store 5 LDR 2016 x 1512 frames in 2 s (including the wait for each change of lighting?). That seems very fast to me for the HQ. What mode are you using? Full resolution + ROI (zoom + resize) or the 2x2 binned mode (binned mode has always given me bad results)? This is why I am staying with the V1. I capture in V1 full resolution 2592 × 1944 with a centered ROI of 1440 × 1080, and the maximum measured framerate is 10 fps. The advantage is also that we only use the center of the sensor, where the lens vignetting, aberrations and blur are smaller.

  • On the other hand, 2.5 seconds for post-processing (Mertens merge?) seems high to me. Are you doing it on the Pi or the PC? I do it on the PC; the network speed is sufficient to transmit the 3 bracketed LDR frames without slowing down the capture. (Note: if you do it on the Pi you should use multiprocessing and not multithreading.)

  • At the beginning of my project I did a lot of testing on the “lens-shading” correction, which is not really easy to do. I gave up on it entirely and make this correction on the PC instead.

Well, I am operating the HQ camera in video mode. In another thread on this forum I showed that in this mode even with rather low quality settings, the image quality of the HQ cam is quite good.

I usually work with a quality setting of 60% - there’s no visual difference between this setting and a 99% setting.

That is in stark contrast with the v1/v2 cameras, which show a noticeable visual difference between these two settings. If I want to speed things up a little, I drop down to 40%.

I assume the operational mode I am using is Mode 2, 2028 x 1520 pixels. That is most certainly a binned mode.

For the timing: I checked the time stamps on some recent captures, just to be sure of the time needed for a single capture. Capturing, transmitting and storing a single stack of 5 frames at a resolution of 2016 x 1512 pixels needs on average between 1.76 seconds (q=40%) and 2.20 seconds (q=60%). This time includes film advance plus switching through 5 different illumination settings, taking an exposure at each setting and transferring the images to a remote PC.

The dependence of the timing on the quality setting hints that the file sizes of the transmitted images might basically be the main bottleneck of the whole setup. Indeed, I think that the speed of the LAN connection between the Raspberry Pi and the PC is a limiting factor. In fact, the scanning rhythm speeds up a little when scenes with little detail are scanned, and slows down a little for scenes with high detail and contrast.

Well, I do most processing on the PC. With respect to the timing, there are several aspects which come into play here.

First of all, it makes a difference whether you process 3 or 5 images within the exposure fusion approach. On my PC, an Intel Core i7-8700 CPU @ 3.2 GHz with 16 GB RAM, the opencv version of the exposure fusion algorithm needs 1105 msecs for a stack of 5 input images of size 2016 x 1512 pixels, but only 657 msecs for 3 input images.

If I do not process the full resolution (5 frames @ 2016 x 1512 px = 1105 msec), but for example 5 frames @ 1200 x 900 px, the PC needs only 386 msecs for the same task. This resolution is kind of close to the resolution you are working with. In fact, running the opencv implementation with your parameters (3 frames @ 1440x1080 px), I end up with a processing time of 353 msecs on average.

Now, I must confess that I am additionally using an implementation of the exposure fusion algorithm coded by myself, mainly because I wanted access to various intermediate calculation results and parameters.

My Python-only version is slightly slower than the original opencv implementation. It needs 1424 msecs for 5 frames @ 2016 x 1512 px, which is about one-third slower than the (presumably C-coded and compiled) opencv version.

The approximately 2.5 secs of processing time per frame stated above come from the fact that in addition to the exposure fusion processing, the images have to be read from disk (150 msecs). Next, these images are aligned with each other (200 msecs), exposure-fused (1424 msecs) and finally written back to disk as a single 16-bit .png file (350 msecs). Add a little bit of book-keeping and debug displays and you end up with the stated 2.5 secs per frame of processing time.

A last comment with respect to “lens-shading”. Lens-shading can in no way compensate the mixing of the color channels induced by the mismatch between the micro-lens array on the v1/v2 sensor chips and a lens with a longer focal length, which you would use in a scanning application. In retrospect it’s trivial, but it took me quite some time to realize this. Even if you can compensate the color drift of the v1/v2 cameras (I got that far), you gain nothing, because toward the edges of the frame the color saturation of your image goes down anyway. So your approach of using only the center part of the v1 camera frame is certainly a valid one. For me, in the end, the new Raspberry Pi HQ camera was the major step forward.

Yes, you probably use the binned mode 2, and that’s why your capture is quite fast, which surprised me. It would be much slower with mode 3 at full resolution. But I made comparisons with a SMPTE test pattern and you really do lose resolution when binned.
Of course you have to stay in video mode, otherwise it’s even slower.
The quality of the JPEG compression is a different issue; as it is done in the ISP at the end of the pipeline, I don’t see why it would be better for the HQ than the V1 or V2? I still think that with a quality of 40% we should see JPEG compression artifacts?
I think some people get the wrong idea about the HQ: yes, there are 12 MP, but that is because the sensor is bigger; the pixel size is the same. In my tests with the test pattern I do not see a difference in resolution between the V1 and the HQ.
For lens shading, yes, it is impossible to correct the color mixing with the V2, but the V1 does not suffer from this problem. I don’t have this loss of saturation at the borders. Note that I make this correction with the same algorithm, but on the PC, because it is more convenient. I do it at the start of each reel, on the projection window and without the film.
The only real advantage of the HQ is better dynamics, especially in dark areas, but HDR will correct that.
For me, capturing a ROI at the center of the sensor is interesting to avoid lens aberrations. Without this, still with the test pattern and my lens, I really get a loss of sharpness at the edges and corners.
To speed up your processing on the PC, you should use multiprocessing, but it’s a little more complicated! Even multithreaded, a Python program uses only one processor. In my application I do the merge as frames are received, with no storage of the LDR frames.
Finally, it is true that with a bracket of 5 the network can be a bottleneck. The Pi 3 is not really gigabit because everything goes through the USB bus; the Pi 4, on the other hand, does have an independent gigabit port.

Yes, indeed. In full resolution mode, the free-running transfer speed is slightly below 10 fps, but in the binned mode 2 the transfer speed reaches 30 fps (which is the maximum possible, as I am working with an exposure time of 1/32 sec). Of course you are losing resolution in the 2x2 binned mode, but 2016 x 1512 px is more than enough for my purposes. As an added advantage, the binning reduces the sensor’s noise level by a factor of two.

Well, the pipeline of the HQ cam certainly uses a different processing scheme. Have a look at my tests with the HQ cam with different quality settings. Clearly, the quality parameter is treated differently in the HQ camera compared to the v1/v2 sensors. There was also some discussion about that on the Raspberry Pi forums, but I cannot locate that post any more.

The main advantage of the HQ camera is that it features a 12 bit ADC, compared to the 10 bit ADCs found on the v1/v2 cameras. Also, if you use the 2x2 binning mode, you in effect work with a four-times larger pixel area than specified in the data sheet for this sensor. This tends to reduce image noise.
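The noise argument is easy to verify numerically: averaging four pixels with uncorrelated noise reduces the noise standard deviation by sqrt(4) = 2. A small numpy sketch (the signal and noise levels are made-up values):

```python
import numpy as np

rng = np.random.default_rng(0)
# simulate a flat sensor signal of 128 with additive read-out noise (std = 8)
frame = 128.0 + rng.normal(0.0, 8.0, size=(1024, 1024))

# 2x2 binning: average each 2x2 block of pixels
binned = frame.reshape(512, 2, 512, 2).mean(axis=(1, 3))

print(frame.std())   # ~8, the original noise level
print(binned.std())  # ~4, i.e. the noise is halved
```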

Granted, with that binning mode, the effective resolution of the HQ cam drops to 2028 x 1520 px, compared to 2592 × 1944 px for the v1 camera, for example. But as you noted already, you get a much better camera signal, especially in dark image areas. That is my main concern, not resolution, as the movie format I use for sharing these old digitized Super-8 movies is usually only 960 x 720 px anyway.

Let me elaborate a little on the topic of lens shading. Normally, you should do lens shading compensation on the raw sensor signal, basically because the lens shading tables are just local multiplier tables for the four color channels coming straight out of the sensor chip. Their main purpose is to compensate vignetting effects by appropriately scaling the raw camera signal before it enters further processing steps in the image pipeline.

If you do the lens shading compensation on the PC side, you normally need to use the raw image. This can be delivered as an appendix to the normal .jpg if you instruct the Pi to do so. If you instead do the lens shading calculation on a jpg image, it will only partially work - namely in the intensity range where the jpg pipeline is more or less linear. It will certainly not result in a good compensation in dark and bright image areas, where the jpg pipeline is highly non-linear.

Also, while lens shading can compensate vignetting effects, it cannot compensate cross-talk between color channels. That is simply not possible with the mathematical operations involved. Compensating cross-talk would require a (4x4)-matrix operation to de-mix the color channels, but within the lens shading context you have only a single multiplier available for each color channel.
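The difference between the two operations can be made explicit in a few lines of numpy. Lens shading is exactly the special case of a diagonal matrix, which is why it can rescale channels but never de-mix them; all numbers below are made up purely for illustration:

```python
import numpy as np

# one RGGB quad: raw values [R, G1, G2, B] at some pixel position
quad = np.array([0.20, 0.35, 0.33, 0.90])

# lens shading: one multiplier per channel at this position;
# mathematically a diagonal operation
shading_gain = np.array([1.30, 1.10, 1.12, 1.05])
shaded = shading_gain * quad
# ...which is identical to applying a diagonal matrix
assert np.allclose(shaded, np.diag(shading_gain) @ quad)

# de-mixing cross-talk needs the full (4x4)-matrix with off-diagonal
# entries; those off-diagonal terms are what lens shading lacks
M = np.array([
    [ 1.08, -0.02, -0.02, -0.04],
    [-0.03,  1.10,  0.00, -0.07],
    [-0.03,  0.00,  1.10, -0.07],
    [-0.01, -0.02, -0.02,  1.05],
])
demixed = M @ quad
print(demixed)
```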

Now it is important to note that when you pair a long focal length lens with the micro-lens array of a v1/v2-sensor tuned to a lens of say 4mm focal length, crosstalk is introduced between the color channels.

Here’s an old example showing that effect from a v1-camera experiment, where the camera looked at a pure blue background:


What you see is the spill of the blue color component into the other three color channels. Ideally, the blue channel should be a homogeneous gray tone (it’s not, because of vignetting and dirt on the sensor), and the other color channels should be just pure black. This is obviously not the case. Note the asymmetry between green channel 1 (spill left and right) and green channel 2 (spill top and bottom). That is caused by the different geometric positions of these channels with respect to the blue channel. In any case, the important point is: you get a color spill if you pair a v1/v2 camera with a long focal length lens, and you cannot compensate this color spill within the context of a lens shading algorithm.

In the end, the color spill results in desaturation effects and slight color shifts for certain colors, noticeable towards the edge of the image frame. This effect is much less severe with the v1 than with the v2 camera, but it is measurable and noticeable.

Granted, you can tune the lens shading at the start of a reel so that the empty gate appears pure white, and that setting will give satisfactory results, especially with the v1 camera. But there will be tiny color shifts towards the edges of the frame for any color which is not white/gray. However, since you are using only the center part of the camera image, chances are that this issue is not noticeable.

Finally, I do not think that I can do much more to speed up processing with my hardware. Actually, line number 10 of my python script is already

import multiprocessing as mp

and I am employing some other processing tricks as well for speedup. That is the reason why my python script is only one third slower than the compiled opencv version. When I started with this code, processing times of the python script were on the order of 7000 msecs or slower per frame.

Anyway, it would be interesting for me to get some timing information about the standard exposure fusion algorithm on other hardware. The processing times reported above have been obtained with the following code segment:

import cv2
import time
time_msec = lambda: int(round(time.time() * 1000)) 

merge_mertens = cv2.createMergeMertens()

tic = time_msec()
result = merge_mertens.process(imgStack)
toc = time_msec()

Here, imgStack is just the python list of the 3 or 5 source images for the algorithm. It would be great to see how fast this algorithm performs on other hardware.

I don’t quite agree with you on the resolution.
Yes, a vertical resolution between 1000 and 2000 seems sufficient to scan an 8mm or Super-8 film; I chose 1080, the usual HD resolution.
But there is also the resolution of the details within the image, for example at the edges, and for this, in my opinion, the binned mode is not good.
Below are two captures with the HQ, in binned mode 2 and full mode 3. There really is a big difference!
The third image is a full sensor capture in mode 3. The aberration at the edges is clearly visible.
The lens is a good quality 35mm CCTV lens, probably similar to the 16mm lens for the HQ.


Hi @dgalland - that is a very interesting observation! I never looked at the camera images in such detail (so far I always worked with the automatic sensor_mode=0), but I can support your statement.

Here’s the full scan of a resolution target (I do not have that nice SMPTE Super-8 frame you used):

In the full frame, differences are barely noticeable, but if you cut out the inner two resolution targets and compare them (rotated 90° for convenience):

you can clearly see the difference between Mode 2 and Mode 3 (you might have to click on the image to see it at full resolution). Note that in the cut-outs above I enlarged the original pixels two times compared to the full frame image in order to show the differences better.

What is also visible in these test images is that quite noticeable sharpening is happening in Mode 2. It’s less visible in Mode 3, but it’s there as well. It looks like the sharpening step happens in the processing pipeline after the binning step, which is usually performed right at the sensor. To get rid of this, you need to tune the sharpness parameter of the Raspberry Pi camera down to -100 or so.

What might be interesting also for other people: with my setup, the free-running frame rate in Mode 2 hovers around 25 to 26 fps, while in Mode 3, the frame rate drops down to 10 fps. Actually capturing one stack of 5 differently exposed LDRs took 1.35 sec in Mode 2 and 3.85 sec in Mode 3.

Yes, binning gives a very bad result (blur and halos around the edges) which cannot be corrected in post-processing. In full frame you seem to have no distortion at the edges; your lens must be better than mine, but maybe it’s also because you work without the extension rings? Unfortunately, setting a smaller ROI doesn’t change the capture speed, because the ROI is in fact a zoom within the full resolution mode.
As for multiprocessing, I have doubts about your program; the simple import is not enough, you also have to implement a sort of dispatcher which distributes the frames to the processes. Is that what you are doing?

well, the blur comes from the binning operation; as binning is usually done in the analog domain on the sensor chip, the transfer function of this operation is not ideal. In a perfect world you would apply some Gaussian-shaped filter before reducing the resolution, but that is not possible at the sensor level of the pipeline. However, binning at that early stage dramatically reduces the amount of data which needs to be transferred, and that is one of the reasons such modes exist. The other reason is the aforementioned noise reduction.

In Mode 3, the full sensor frame is transmitted to the Raspberry Pi and the scale reduction is done on the GPU. So no speed advantage here (therefore: max. 10 fps). But here, on the GPU, you have the possibility to use a better filter before down-sizing. That’s why the Mode 3 image looks better in the details.

Sharpening is happening on the GPU for any mode, but before any downsizing occurs. That’s why the halos are annoying in Mode 2 and barely visible in Mode 3. If you turn the sharpness parameter of the Pi pipeline down, the halos will go away. The optimal setting for sharpening will depend on the MJPEG compression chosen.

The lens that I am using is the humble Schneider Componon-S, 50 mm, which has been discussed before in this forum. Originally an enlarger lens, it delivers a very good image. Be aware that there is a light pipe built into this lens that you will need to cover up if you use the Componon-S as an imaging lens. One can buy these lenses used quite cheaply, and it is still produced by Schneider, as far as I know. I bought mine in the 80s for the color lab I had at that time…

Here’s a picture of how I use this lens:

As you can see, the basic configuration is a 1:1 image path. That means that the distance between film gate and lens is about the same as the distance between lens and sensor chip, and both distances are approximately two times the original focal length of the lens. In fact, the focal length is 50mm, and the distance between the lens mount and the camera sensor is 91.5 mm. The Schneider Componon-S 50 mm has the best optical performance if you operate it at a f-stop of 4.0; lower values start to blur the image, while with higher f-stops than 4.0, the Airy disk gets too large.
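The 1:1 geometry follows directly from the thin-lens equation; a quick check (treating the Componon-S as an ideal thin lens, which is only an approximation - the measured 91.5 mm flange-to-sensor distance differs because a real lens's principal planes are not at its mount):

```python
# thin-lens check of the 1:1 setup: 1/f = 1/d_object + 1/d_image
f = 50.0  # focal length in mm

# for 1:1 magnification both distances equal twice the focal length
d_object = d_image = 2 * f
assert abs(1 / f - (1 / d_object + 1 / d_image)) < 1e-12

# magnification = image distance / object distance
m = d_image / d_object
print(m)  # 1.0 -> the Super-8 frame maps 1:1 onto the sensor
```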

The Schneider Componon-S 50 mm was designed for 35 mm film, so the distortions for Super-8 are negligible. I also own a Schneider Componon-S 80 mm; this one was designed for the 6x6 cm format and might be a perfect lens for digitizing 35 mm with overscan. Anyway, I can really recommend these lenses.

Of course I do not only ‘import multiprocessing as mp’. I utilize quite different schedulers at different places in the processing pipeline to speed things up. For example, one of the simplest is the one used during resizing of the source images: here, every image gets its own task and the whole image stack is resized in parallel (my PC has 12 cores, and I usually work with only 5 images). Another scheduler is used in the alignment section of the pipeline. Here, the whole image stack is split into pairs of images which are then processed in parallel. Yet another one (used in calculating the feature maps the exposure fusion is based on) cuts the images into appropriately smaller tiles and processes these tiles in parallel.

Frankly, I am just fine with the speedups achieved. The pre-compiled original exposure fusion version right out of the opencv box needs on my hardware, for my typical use case (an image stack of 5 images at 2016 x 1512 px), on average 1.105 sec, and my own version, running within the Python interpreter, needs 1.424 sec for the same task. That is only about 30% slower than the reference implementation and probably as fast as you can get in an interpreted environment. If I needed a further speedup, I would probably opt to implement the whole algorithm on the GPU - I have done such things before, which is why I know that it’s a lot of additional work. So I probably won’t go there, at least not in the near future. :smirk:


Your XY adjustment platform looks great! Would you be willing to post parts and design files for that? Forgive me if you already have.

Hello Matthew, at this point in time the design is barely usable, so I would rather not share it right away. But for the moment, I can describe the design of the xyz-stage and its current status. That might be of interest.

There are several reasons why the design is not up to the task: first, the 3D-printed structure is just not as rigid as, for example, a professional xyz-stage (which starts at slightly above 150€ on the usual shopping channels). Secondly, the plastic “bearings” I misused in the construction of the sliders have a slightly too large gap between them and the axes they are sliding on. Lastly, 3D-printing this stuff needs quite some tuning of the print process in order to obtain good results.

Anyway, here’s a sketch of the basic principle: the stage pictured in the comment above is actually an xyz-stage composed of three identical slider units, connected with three different connectors so that all three axes can be adjusted independently (this is one of the points which does not work with the current version).

A single slider unit looks like this:

There is a base which supports two holding blocks for two Ø 5 mm steel axes and a standard M6 threaded rod in the center. This threaded rod is fixed between the holding blocks with two stop nuts. The fixation via the stop nuts is a little difficult to get right - too loose, and your slider moves on its own; too tight, and the thread is hard to turn.

The slider itself has two M6 nuts. The left one in the image above is fixed, the right one is able to move. On the left side of this nut there is a spring (probably hard to spot) which keeps a certain tension between the fixed and the movable nut. That is a simple way to reduce backlash.

I misused some plastic bearings I had on hand to realize the sliding mechanism. They can be seen in better detail in this picture (it’s an IGUS part, the dark brown circular thing):

As it turned out, the tolerances between the steel axes, the plastic bearing and the 3D-printed part are just a little too large to constrain the movement of the slider reliably to only one axis (which is the intention of the whole construction). The slider can also tilt a tiny bit around an axis perpendicular to the main axis. So far, of all the stages I printed, only one came out with negligible tilt, probably just by chance. There’s certainly room for optimization here…

The units are deliberately designed to be very slim, because three of them are put together into an xyz-stage with special connectors. You can see them (the colored pieces) in the following rendering of the stage:

During the initial tests I found out that I would actually have needed an additional, fourth axis - namely the rotation of the camera around the optical center of the frame. But I found no easy way to implement that. So instead, I measured the necessary correction angle and reprinted the green connector in the above image with an angle slightly different from the initial 90°. That worked.

Anyway. At the moment, this design is better than what I had before, but there are certain points which need improvement:

  • I printed this in PLA. Compared to steel or aluminium, this plastic is quite flexible. If you touch the camera or one of the dials of the xyz-axes, the image jitters. It is not impossible, but it is hard to focus with such behaviour.
  • An additional drawback of that missing stiffness of the setup: even the slightest vibration shows up in the scanned images. In my current setting, a back of the envelope calculation shows me that a single pixel corresponds to about only 2.5 microns on the film. That is a tiny distance! Any movement will show up in the scan. So currently, the whole scanner has to be placed on a platform which is sure not to move. I simply use my lab floor for this, but even then, very heavy footsteps can be “recorded” by the scanner…
  • The tilt of the slider which I mentioned above certainly needs to be reduced. One way to do this might be to design a better fit between the plastic part and the steel axis; I need to look into this. Another option would be to increase the span of the slider. Each slider is currently about 28 mm long, which gives me a rather large sliding range of approximately 50 mm. I do not need such a large sliding range on any of the axes; increasing the span of the slider will reduce both the sliding range and the tilt I want to avoid.
  • Another thing I discovered during testing: the M6 thread I am using has a pitch of 1 mm per revolution. For focus setting, actually a slightly larger pitch would better match my taste. But I think I will not bother with this.

There might also be other points in this setup I have not yet discovered.

Well, what might be of interest to the community is the design of the camera body:

The distance between camera and lens is fixed at 91.5 mm; as explained above, this gives you approximately a 1:1 optical imaging with the Schneider Componon-S 50mm, with some slight overscan of a single Super-8 frame.

Since my printer is unable to print the M39 thread of the lens, I opted to reuse a photographic mounting plate I had. Since the screws of this plate are very tiny, I created not holes but tiny slits for them (visible in the above 3D-rendering).

Note also the circular cone just after the mounting plate - this gets rid of stray light from the light pipe of this lens.

To further reduce stray light, I covered the inner section of the camera body with matte black paint.

Finally, here’s an image of the “business side” of the camera body, holding the Raspberry Pi HQ camera:

And here’s a link to the .stl-file of that camera body.


This is fantastic, thank you for the documentation and explanation. This will give anyone wanting to make their own a lot to go off of. When I get to that stage I’ll definitely be using it as a reference!