Enable multiband runs (single-band unchanged)#23
Enable multiband runs (single-band unchanged)#23Jayashree-Behera wants to merge 5 commits intoschlafly:masterfrom
Conversation
e9c901b to
701c4a0
Compare
schlafly
left a comment
There was a problem hiding this comment.
This looks great, thanks Shree.
Let's delete mosaic.py which will be broken. I think we will need some decam_proc.py updates but they should be modest.
I'll have some more comments tomorrow, sorry to be slow.
|
|
||
| return x[m], y[m], sigim, modelsigim |
There was a problem hiding this comment.
We should either document the new returned keys or not return them.
| if weight is None: | ||
| weight = np.ones_like(im) | ||
| wt_pad = np.pad(weight, [pad, pad], constant_values=0.) | ||
| wt_pad[wt_pad == 0] = 1e-20 |
There was a problem hiding this comment.
You're duplicating my existing code, but I can't figure out for the life of me why I did this. Let's keep this for this commit, but can you file an issue to check removing this after we merge this commit?
| # PSF stamp sizes | ||
| sz_b = [np.array([psfs[b][0][i].shape[-1] for i in range(N)]) for b in range(B)] | ||
| szo2_b = [sz // 2 for sz in sz_b] | ||
| stampsz = max(max(sz) for sz in sz_b) if N > 0 else 19 # for safety, I pick a common stampsz across both bands |
There was a problem hiding this comment.
I think this stamp size is used only for determining how much to pad; we need it to be the same in each image to keep the images aligned. If that's right, I'd say something more like "to keep images aligned" rather than "for safety."
There was a problem hiding this comment.
Thinking about this more, I had missed that this function doesn't do the padding. So x & y are in unpadded images and the images are padded. I think you could save some logic by adding the padding to x & y when calling build_sparse_matrix, and then maybe build_sparse_matrix doesn't need to know anything about the padding?
|
|
||
| # parameter count | ||
| ncol = B * N # fluxes | ||
| if psfderiv: ncol += 2 * N # shared dx, dy per source |
There was a problem hiding this comment.
| if psfderiv: ncol += 2 * N # shared dx, dy per source | |
| if psfderiv: | |
| ncol += 2 * N # shared dx, dy per source |
I didn't realize this was legal python. Let's not use this syntax.
| """ | ||
|
|
||
| # Detect single-band or multiband case | ||
| if np.array(ims).ndim == 2: ims = [ims]; psfs = [psfs]; weights = [weights] |
There was a problem hiding this comment.
| if np.array(ims).ndim == 2: ims = [ims]; psfs = [psfs]; weights = [weights] | |
| if np.array(ims).ndim == 2: | |
| ims = [ims] | |
| psfs = [psfs] | |
| weights = [weights] |
| modelst += psfs[1] * flux_b[1:len(x) * len(psflists[b]):len(psflists[b])].reshape(-1, 1, 1) | ||
| modelst += psfs[2] * flux_b[2:len(x) * len(psflists[b]):len(psflists[b])].reshape(-1, 1, 1) | ||
|
|
||
| # centroid outputs are zeros; stamps unchanged - This is to keep output structure identical to previous version |
There was a problem hiding this comment.
I think this is an outdated comment.
| # 3: weights, | ||
| # 4: psfs without shifts | ||
| res = (modelst + residst, imst, modelst, weightst, psfst) | ||
| stamps_b.append(res) # keep only stamps tuple |
|
|
||
| # print(f"[fit_im] total time: {time.time() - t0_all:.3f} s") | ||
| return stars, [models[b] + skys[b] for b in range(B)], [skys[b] + mskys[b] for b in range(B)], psfs, iter_history |
There was a problem hiding this comment.
This should probably become a dictionary as the number of things in the tuple gets longer.
|
|
||
| # Sources in this subregion (shared positions across all bands) | ||
| mbda_src = in_bounds(xa, ya, [bdxaf-0.5, bdxal-0.5], [bdyaf-0.5, bdyal-0.5]) | ||
| if not np.any(mbda_src): continue |
There was a problem hiding this comment.
| if not np.any(mbda_src): continue | |
| if not np.any(mbda_src): | |
| continue |
| if guessflux is not None: | ||
| guess = numpy.concatenate([guessflux, numpy.zeros_like(xn)]) | ||
| guess = np.concatenate([guessflux, np.zeros_like(xn).repeat(B)]) |
There was a problem hiding this comment.
I think this has the right length but the wrong order? i.e., you want something like np.concatenate(sum([guessflux[b*N:(b+1)*N], np.zeros_like(xn)] for b in B])
Claude spotted it and not me, but looks right? That feels important.
schlafly
left a comment
There was a problem hiding this comment.
This looks good. Several comments inline. One bigger question: when running on a multiband image, I think fit_im now returns a "multiband" output; i.e., both _joint quantities and the original quantities, now with _b0 appended. There's also one newer quantity, snr_joint.
I'm wondering whether in the single band case we should compress the output before returning, doing roughly:
cat = {k[:-3]: cat[k] for k in cat if k.endswith('_b0')}
Then I think e.g. DECam processing and single-band WISE processing would be less affected (the returned catalogs would be basically the same as before. You may have already addressed the single-band WISE case in wise_proc?
| mbd = in_bounds(xa, ya, [bdxf-0.5, bdxl-0.5], [bdyf-0.5, bdyl-0.5]) | ||
|
|
||
| # Expand to band-space indices for flux/guess (flat [B*N]) | ||
| mbda = np.concatenate([b*len(xa) + mbda_src for b in range(B)]) |
There was a problem hiding this comment.
I'm confused about what's happening with this one. mbda_src is boolean, but this looks like integers? I feel like this must be right otherwise there would be big issues, but I'm missing something.
| # Sources in this subregion (shared positions across all bands) | ||
| mbda_src = in_bounds(xa, ya, [bdxaf-0.5, bdxal-0.5], [bdyaf-0.5, bdyal-0.5]) | ||
| if not np.any(mbda_src): continue | ||
| mbd = in_bounds(xa, ya, [bdxf-0.5, bdxl-0.5], [bdyf-0.5, bdyl-0.5]) |
|
|
||
| ind = np.flatnonzero(mbda_src) # global indices into 0..N-1 | ||
| ind2 = np.arange(len(ind)) # local indices 0..M-1 | ||
| M = len(ind) # Its same as len(ind2) |
There was a problem hiding this comment.
I think we lost a bit of the logic here. It clearly doesn't make a big impact, but here's the intent:
- we cut the image into subregions
- each subregion has a 'primary' region; the union of the primary regions covers the image.
- it also has a boundary region; we include the boundary regions in the fits so that stars on the edge of the primary region aren't cut off and can be fit simultaneously with their neighbors. The primary + boundary region gets the 'a' suffix ('all')
- we do the fit on each subregion independently ('all')
- we only fill out the fluxes of the stars that were in the 'primary' region
I think the new code does all of this right except for 5.
|
|
||
| # Centroiding step (shared across bands) | ||
| N = len(xa) | ||
| if N > 0 and len(flux) >= B*N + 2*N: |
There was a problem hiding this comment.
Is this equivalent to N > 0 and psfderiv? I worry about relying on len(flux) since I guess there could be a lot of sky terms in there?
| flux_snr = flux_snr + (flux_snr == 0)*1e-20 | ||
|
|
||
| xcen /= flux_snr | ||
| ycen /= flux_snr |
There was a problem hiding this comment.
I think I personally would have done flux_avg = np.sum(flux * fluxunc**-2) / np.sum(fluxunc ** -2), and xcen /= flux_avg, ycen /= flux_avg. This is also fine though since it's the first iteration and we just need to do something sane.
Perhaps add some comments that this only applies to the first iteration since that's the iteration where we don't know the flux scaling.
What happens to the centroids for newly detected sources? It feels like they probably need similar handling?
| # "mad_y": float(mad_y), | ||
| # "rms_xy": float(rms_xy), | ||
| # } | ||
| # ) |
| # "ya": ya.copy(), | ||
| # "model": [ (models[b] + skys[b]).copy() for b in range(B) ], | ||
| # "sky": [ (skys[b] + mskys[b]).copy() for b in range(B) ], | ||
| # }) |
| # - Isophotal flux/centroid using PSF derivatives. Flux within brightness contour. | ||
| # Useful for galaxies, extended emission. |
There was a problem hiding this comment.
| # - Isophotal flux/centroid using PSF derivatives. Flux within brightness contour. | |
| # Useful for galaxies, extended emission. | |
| # "isolated" flux; fit flux & centroid as if the star were a single star, based | |
| # on imaged where neighbors have been subtracted from best simultaneous fit. |
This PR adds multiband source fitting utilities to crowdsource_base.py and updates wise_proc.py to run in multiband mode. Multiband functions are named after their corresponding single-band functions with "_multiband" appended.
Single-band run behavior is unchanged.