Agreed. It’s a slightly unfair comparison…
Well, the specific frame you use for testing certainly does not pose too great a challenge. It seems it was taken on a bright day with an overcast sky. Besides, you are using the modified tuning file for the IMX477 for capture - which was developed precisely to improve these things.
For some examples of the difference between single scans and exposure-fused footage, have a look at this post. Those scans were taken in the early days with the standard tuning file of the IMX477. Since then, things have changed; I now use only 4 different exposures while scanning.
I did a comparison here between the results of exposure fusion and raw development. In summary: if you get your exposure right, the raw capture is mostly equivalent to an exposure fusion approach. However, while the latter automatically creates a visually pleasing image, the former requires some kind of raw development - at the time of the test, the color science for developing raw images from the IMX477 was not available (this has changed in the meantime), so it was not an option back then.
I think if you compare your scans closely, you will still notice some tiny differences, especially in the dark shadows of the tree. In such areas, the camera signal from an exposure-fused stack simply has a finer quantization than the raw image file. Granted, you will usually not need that precision, as these areas are rather dark anyway - but sometimes you might want to brighten these areas in post, and then you will notice a difference.
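To illustrate the quantization point, here is a toy numpy sketch (synthetic data, not an actual scan): pushing deep shadows of an 8-bit capture up in post exposes the coarse code-value steps, while a float-valued fused stack keeps its intermediate levels.

```python
import numpy as np

# A deep-shadow gradient as the scene actually is (normalized luminance):
shadow = np.linspace(0.0, 0.02, 100)

# The same gradient after 8-bit quantization - only a handful of
# distinct code values survive in this dark region.
quantized = np.round(shadow * 255) / 255

# Brighten the shadows by 8x "in post":
boosted_8bit = np.clip(quantized * 8, 0, 1)   # few levels -> visible banding
boosted_float = np.clip(shadow * 8, 0, 1)     # smooth gradient preserved

print(len(np.unique(boosted_8bit)))
print(len(np.unique(boosted_float)))
```

The first count is tiny (the banding you would see on screen), the second keeps the full resolution of the gradient.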
With respect to your code - nicely done, and I guess a great example for the forum. A few remarks: it takes some frames until a requested exposure value is actually realized by the camera. And sometimes a shutter speed of 9600 is realized as a shutter speed of 4800 combined with a digital gain of 2. So you might want to check the metadata returned with each frame for both the correct shutter speed and the digital gain value. Or: just wait for about 15 frames between setting the shutter speed and the actual capture. Another issue is hidden in the line
merged = np.clip(merged * 255, 0, 255).astype(np.uint8)
This tacitly assumes that the merge process delivers a float image in the range [0.0, 1.0]. At least in older versions of OpenCV, this was not the case. So you might instead try something like
minimum = merged.min()
maximum = merged.max()
merged = np.clip((merged - minimum) / (maximum - minimum) * 255, 0, 255).astype(np.uint8)
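The metadata check mentioned above could be sketched as a small helper like this (an assumption-laden sketch: the keys `ExposureTime` and `DigitalGain` are the ones Picamera2 reports per frame, and the tolerance value is just a guess):

```python
def exposure_settled(metadata, requested_us, rel_tol=0.02):
    """Return True once the camera reports the requested shutter speed
    with no hidden digital gain applied.

    metadata     -- per-frame metadata dict (e.g. from Picamera2's
                    capture_metadata()), with 'ExposureTime' in
                    microseconds and 'DigitalGain' as a float
    requested_us -- the shutter speed that was requested, in microseconds
    rel_tol      -- hypothetical relative tolerance for the comparison
    """
    exposure_ok = abs(metadata["ExposureTime"] - requested_us) <= rel_tol * requested_us
    gain_ok = abs(metadata["DigitalGain"] - 1.0) <= rel_tol
    return exposure_ok and gain_ok

# Usage sketch: poll frames until settled, or give up after ~15 frames
# (assumes a running Picamera2 instance named picam2):
# for _ in range(15):
#     md = picam2.capture_metadata()
#     if exposure_settled(md, 9600):
#         break
```

This catches both failure modes at once: the exposure not yet having reached the requested value, and the camera silently substituting half the shutter speed plus a digital gain of 2.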