output at flexible time levels#217
output at flexible time levels#217guoqing-noaa wants to merge 11 commits intoufs-community:gsl/developfrom
Conversation
…ttribute Introduce a new `output_timelevels` attribute for MPAS streams that enables variable output intervals. With this capability, we may outoput every 15 minutes in the first hour, every hour in the first 3 days, every 3 hours for the next 4 days, and every 6 hours in the last 3 days. We can also use this to only write out forecast files during a given period, such as: output_timelevels="6-12h" Check the PR description for details on how to specify the time levles. Here is a quick example: output_timelevels="0-3-15m 4-72 75-168-3 174-240-6"
|
|
The following situations have been tested: |
output_timelevels stream attrbute|
For a conus12km 12 hours forecasts, I tried different settings for the the |
SamuelTrahanNOAA
left a comment
There was a problem hiding this comment.
These code changes are overcomplicated and will be difficult to extend or maintain. Any significant changes to the functionality will require a complicated rewrite.
More specific overview comments:
- Most of the parser code is a reimplementation of the Fortran standard
splitfunction. It would be better to callsplit. - Much of the time-related code reimplements the MPAS and ESMF alarm functionality. It would be better to extend the alarm functionality.
- Parser errors aren't reported. If you type something wrong, you don't know what it was.
- There are out-of-bounds array reads due to off-by-one errors.
- The new syntax deviates from the MPAS days_HH:MM:SS time specification.
- Unlike the rest of the MPAS I/O interface, this uses a cryptic string instead of XML tags. If you used XML tags, the parser code wouldn't be necessary at all.
If you want to use a parser generator, the original grammar specification should be in the repository, not the automatically-generated code. |
No, I don't intend to use a parser generator. I put EBNF there to make sure the overall timelevel specification logic is consistent and complete, and there are no hidden surprises. |
@SamuelTrahanNOAA Thanks for the comments and discussions! This is the first version to get things work. I also feel the implementation is kind of complicated and I would like to descope to meet our current needs only. My thought is to limit the timelevel specifications to only support minutes and hours (may also limit the output to have to start at 0h). I intentionally try to avoid using the same time format |
It is, indeed, a finely-polished EBNF with perfect indentation. However, it doesn't describe the code. Your |
What is more user-friendly is splitting them into individual XML tags so it is clear what is going on. That would use the existing MPAS parsing code (eliminating the parser). Also, it'll be less confusing to experienced MPAS users. <output_interval start="01:15:00" stop="02:30:00" step="00:15:00"/> <!-- every 15 minutes from 1:15 to 2:30 -->
<output_interval start="06:00:00"/> <!-- Only at hour 6 -->I can see how your string would be easier to pass through the rrfs-workflow bash scripts, but bash limitations shouldn't take precedence over MPAS code consistency and quality. |
|
Can you please provide:
|
|
@clark-evans I am sorry, this may not be ready for an official review yet. I should have put this in the draft mode since the beginning. |
This looks good. Does current MPAS code support this? |
No. I'm suggesting it as an alternative implementation to your recursive descent parser. |
|
@SamuelTrahanNOAA Thanks for lots of great discussions! For the moment, this PR is mainly for a test/demo purpose. I think my initial need is much simpler that the goal in this PR. We want to output For our current needs, I think NCAR Andy Stokely's changes may work. I will test that first. I was not aware of Andy's changes until 2 days ago. |
|
WRF was limited to start...step...end for each stream. We were able to work around that by having multiple output streams. (Recall that the operational models HRRR, HWRF, NAM, and RAP were all WRF-based, along with other quasi-operational models.) |
|
@guoqing-noaa @SamuelTrahanNOAA For example, output_fh = 0.2 0.4 0.6 0.8 1 2 3 6 9 12, would provide output every 12min for the first hour, every hour for the next three hours, and every three hours for the remainder of the simulation. |
@dustinswales It is good to know this capability in UFS.
Thanks! |
I believe output_fh would be just need to be really long in this case,output_fh: 0.25 0.50 0.75 1 2 3 4 5 .... 72 75 78 .... 168 174 180 ... 240. I'm sure we could find a way to distill this information if it become a pain.
This is mostly true.
This is going to be the biggest change for Standalone users when moving to inline MPAS. We use the MPAS framework for native input/output, but we do not use the MPAS stream manager as is done in MPAS standalone. |
|
FYI: NOAA GSL doesn't do climate or seasonal research. Hence, the capability to output on multi-week timescales won't affect our work. It is a good question to ask, though, since others in NOAA do seasonal forecasts. I'd expect forecasts out to 90 days for some applications, just not ours. (I just wanted to clarify things since the word "climate" showed up in discussion.) |
@dustinswales I think it is totally fine to enumerate all output hours inside the model. However, it will be very beneficial for users for specify this using a simplified interface, for example, |
| ! Use actual write time for filename (each output gets unique file) | ||
| call mpas_get_time(writeTime, dateTimeString=time_string) | ||
| call mpas_expand_string(time_string, blockID, stream % filename_template, temp_filename) | ||
| else if ( stream % filename_interval /= 'none' ) then |
There was a problem hiding this comment.
Lines 3342-3346: same as lines 3263-3267, but handle the situation when a stream has been created (and no need to start from scratch)
| else | ||
| call mpas_set_timeInterval(filename_interval, timeString=stream % filename_interval) | ||
| call mpas_build_stream_filename(stream % referenceTime, filenameTime, filename_interval, stream % filename_template, blockID_local, test_filename, ierr=local_ierr) | ||
| end if |
There was a problem hiding this comment.
Lines 3777-3960: all these changes are in read_stream(...). It allows we read a stream correctly for an input/output stream with output_timelevels configured.
| ! Use actual time for filename | ||
| call mpas_get_time(now_time, dateTimeString=when_string, ierr=err_local) | ||
| call mpas_expand_string(when_string, blockID_local, streamCursor % filename_template, filename) | ||
| else if ( streamCursor % filename_interval /= 'none' ) then |
There was a problem hiding this comment.
Lines 4191-4195: similar as before, now in mpas_get_stream_filename(...)
| ! Write output_interval to stream | ||
| ! | ||
| IF (stream %filename_interval(1:4) /= "none") THEN | ||
| IF (stream %filename_interval(1:4) /= "none" .and. stream %filename_interval /= "output") THEN |
There was a problem hiding this comment.
If we use output_timelevels, we don't have a fix output_interval, so skip writing output_interval to stream
src/framework/mpas_stream_manager.F
Outdated
| ierr = 1 | ||
| end if | ||
|
|
||
| end subroutine parse_all_timelevels!}}} |
There was a problem hiding this comment.
lines 6037-6123: Parse the entire output_timelevels string into a timelevel_spec structure
This is a helper function and rewritten using mpas_split_string(...). I hope it is much easier to read now.
src/framework/mpas_stream_manager.F
Outdated
| return | ||
| end if | ||
|
|
||
| end subroutine parse_time_string!}}} |
There was a problem hiding this comment.
lines 6126-6215: This is also a helper function.
Parses time strings like "1h30m", "45m", "6", "90s", "1d" into total minutes
Supported units: d/D (days), h (hours), m (minutes), s (seconds)
Plain integers without units are interpreted as hours
src/framework/mpas_stream_manager.F
Outdated
| 'expected format is start-stop-step, but appears to be start-step-stop', MPAS_LOG_ERR) | ||
| end if | ||
|
|
||
| end subroutine parse_output_timelevel_spec!}}} |
There was a problem hiding this comment.
lines 6218-6341: This is also a helper function.
Parses a single segment using format: start, start-stop, or start-stop-step
Time strings can be integers (hours) or duration format (e.g., "1h30m", "45m")
Examples: "6", "0-3", "0-1h-15m", "1h30m-2h-15m"
src/framework/mpas_stream_manager.F
Outdated
| ierr = MPAS_STREAM_MGR_ERROR ! Signal no more output | ||
| end if | ||
|
|
||
| end subroutine get_output_interval_from_timelevels!}}} |
There was a problem hiding this comment.
lines 6344-6441: this is also a helper function:
Given a pre-parsed timelevel_spec and current forecast hour, determines the appropriate output interval in minutes. It handles both the situations of itemized timelevels or time level ranges.
src/framework/mpas_stream_manager.F
Outdated
| ierr = MPAS_STREAM_MGR_ERROR ! No future ranges | ||
| end if | ||
|
|
||
| end subroutine get_next_timelevel_start!}}} |
There was a problem hiding this comment.
lines 6444-6502: a helper function again.
Given a time levels string and current forecast hour, finds the start hour of the next timelevel range (if any). Returns -1 if no future ranges exist.
src/framework/mpas_stream_manager.F
Outdated
|
|
||
| if (present(ierr)) ierr = local_ierr | ||
|
|
||
| end subroutine update_variable_output_alarm!}}} |
There was a problem hiding this comment.
lines 6505-6597: This is the core function of this PR, update_variable_output_alarm
It updates a variable output alarm after it has rung by:
- Checking if the stream uses output_timelevels (via timelevel_spec)
- Calculating current forecast hour (current time - start time)
- Getting the next interval from get_output_interval_from_timelevels
- Removing the old alarm and creating a new one with the updated interval
| ierr_c = 1 | ||
| end if | ||
|
|
||
| end subroutine stream_mgr_add_variable_output_alarm_c !}}} |
There was a problem hiding this comment.
lines 7022-7166: This part provides two functions stream_mgr_set_property_c and stream_mgr_add_variable_output_alarm_c for the xml_stream_parser.c to use at the beginning time when read the streams.atmosphere XML file: decode the output_timelevles part, set the stream properties and create an initial alarm if output_timelevles is defined.
| *status = 1; | ||
| return; | ||
| } | ||
| snprintf(msgbuf, MSGSIZE, " %-20s%s", "output alarm:", "variable (from output_timelevels)"); |
There was a problem hiding this comment.
The changes on xml_stream_parser.c are more straightforward, most are copy and paste and then minor changes to match the output_timelevels situation. Feel free to add comments for any questions.
In a nutshell, this PR attaches to the main code base a new orphan island, which does not affect any existing capabilities if output_timelevels is not used. When output_timelevels is used, it enables the capability to output at flexible time levels.
SamuelTrahanNOAA
left a comment
There was a problem hiding this comment.
I wouldn't approve this for merge to the NCAR authoritative MPAS, but I am willing to approve it for merge to the GSL branch as a temporary workaround. GSL JEDI/MPAS data assimilation development needs this feature, and we can't stall much longer. This is our only available solution, it works, and the fork will probably still be manageable.
Note that the UFS MPAS has a totally different I/O subsystem, so it will replace this workaround eventually.
My objections are:
- The overall length and complexity of the changes will make it harder to maintain the fork.
- Using a custom string format necessitates an over-complicated parser since Fortran is ill-suited as a string processing language.
- A string list instead of XML tags deviates from the MPAS design in a way that reduces flexibility and end-user understandability.
- The implementation restricts the number of output time specifications to a fixed number rather than using proper data storage, such as a linked list.
- Sending the output intervals from C to Fortran after the rest of the tag adds lots of code to send that information later. Had the output interval been sent at the same time as the other attributes, this would have been much shorter.
|
@guoqing-noaa we'll provide a status update on this at the 4/6 RFS meeting. Re: the changes to mpas_stream_manager.F, what do you think about potentially moving some or all of the new subroutines (I counted eight new subroutines) into a new, separate file that is linked in at compile time? I think the new parsing subroutines might be good candidates for this. This would be akin to mpas_sort.F, which separates out search and sort subroutines, and mpas_string_utils.F, which provides a couple of Fortran string utilities; both of these are in src/framework/. |
That's a good suggestion and doable. I will try to update it by the end of tomorrow. Thanks! |
joeolson42
left a comment
There was a problem hiding this comment.
I admit this is an abbreviated review, but at a superficial glance, for a temporary solution, it looks OK.
|
@clark-evans I moved 6 subroutines from
|
|
I have tested the new updates and worked as expected for the previous test cases. |
Thanks, @guoqing-noaa! I'll give it all a look-through today (hopefully). |
clark-evans
left a comment
There was a problem hiding this comment.
I concur with @SamuelTrahanNOAA and @joeolson42 reviews. Moving most of the new functions to a new mpas_output_timelevels.F file is a big help for maintaining the code base, particularly if/when NCAR provides a suitable native implementation of this functionality. I'd like @dustinswales to review before merging, though.
dustinswales
left a comment
There was a problem hiding this comment.
@guoqing-noaa Thanks for this contribution!
This is quite complicated and there is a lot of code to review, so it would be good to add a new test, or extend an existing test(?), that exercises this functionality.
Going forward we should ensure that any new functionality developed is added to our testing harness, otherwise we won't know if it works, or if/when it breaks.
|
@dustinswales Thanks for the great suggestion! |
dustinswales
left a comment
There was a problem hiding this comment.
@guoqing-noaa That works for me.
It's pretty straightforward, so maybe I can take a stab at it sometime this week.
@dustinswales Thanks a lot for being flexible! For information, in most of my tests, I used the following settings:
I have a PR coming soon which will add a new stream attribute |
Introduce a new
output_timelevelsattribute for MPAS streams that enables variable output intervals.With this capability, we may outoput every 15 minutes in the first hour, every hour in the first 3 days, every 3 hours for the next 4 days, and every 6 hours in the last 3 days.
We can also use this to only write out forecast files during a given period, such as:
output_timelevels="6-12h"Check the PR description below for details on how to specify the time levles.
Here is a quick example:
output_timelevels="0-3-15m 4-72 75-168-3 174-240-6"FYI, with this PR, we may write out mpasout files at
"0 1 2 3 6 9 12", while history files at"4 5 7 8 11 13-99999"to avoid duplicate output of both mpasout and history files at the same time levels, which are almost the same.This PR solves issue #214
Priority Reviewers