-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LCM: Time evaluator gives negative deltaTime after STKDiscretization::writeSolution #2
Comments
I have not enough knowledge of LCM (and I think this is LCM-specific, right?). I don't see where we set |
I don't see where it's set to zero either. I agree with @bartgol that this may be an issue of how time is initialized in the parameter library. |
Didn't @calleman21 put this logic setting the parameter to 0 at some point to hack something within the thermo-mechanics problems? |
@ikalash Ah yes, I remember something like that. I don't recall that the time was set to zero, though, although I could be wrong about that. In any case, I'll try to look for that logic, and see if something can be done about it. |
OK, I've had a chance to look at this by playing around in your branch @JustinClough . The problem you bring up does not have anything to do with Coleman's changes awhile ago, as I was originally hypothesizing. The situation you bring up seems to occur in two situations: 1.) The LOCA stepper for some reason goes past the specified "Max Time" in the parameter list, then has to go back. This happens for SurfaceElement_SERIAL, for example, which you can see in the CDash: http://cdash.sandia.gov/CDash-2-3-0/testDetails.php?test=4895087&build=88800 . If you search for "writeSolution", you'll find:
You can see the second time is greater than the third. I don't know all the magic inner workings of LOCA (@agsalin or @rppawlo would be able to say more about it) but I think this sort of situation can happen due to the selection of the Predictor->Method in the LOCA parameterlist. The SurfaceElement problem has it set to "Tangent". If you remove the predictor or set it to "Constant" or "Secant" the issue will go away. 2.) The second situation is the one that affects most of the tests that fail in your branch. The issue is the "IP to Nodal Field" or "Project IP to Nodal Field" responses. Those responses do a solve after a computation completes, and the value of "Time" that the responses get is 0 rather than the actual value. The deltaTime computation gets called, and you end up with a negative deltaTime. I'd have to dig further to figure out why Time isn't updated in the responses. @lxmota may have some ideas. My guess is the value did not matter for those responses, and the deltaTime there is also irrelevant, so no one caught this. Does your case fall into either of the two scenarios above, Justin? Also, since you were saying you're not sure where the Time_old gets set, it is here: https://github.com/SNLComputation/Albany/blob/master/src/Albany_StateManager.cpp#L882 |
in the case of the IP to Nodal Field or Project IP to Nodal Field response).
that was showing up with the IP to Nodal Field and Project IP to Nodal Field respones. This fix cannot yet go in master, as it involves uncommenting fixTime() call, which I think is needed for some LCM problem - need to talk to Alejandro. There are still some failures due to a negative time issue - need to figure out what is going on with those.
Update: I found the issue for the IP to Nodal Field and Project IP to Nodal Field, and pushed a fix to your branch, @JustinClough . @lxmota , it looks like the following call to fixTime() messes up the time passed to these responses for some problems run quasi-statically: https://github.com/SNLComputation/Albany/blob/master/src/Albany_Application.cpp#L2835 . The fix over-writes the current time with 0. Perhaps you could look into this sometime? I don't want to mess with fixTime() in master b/c if I remember correctly, you introduced it b/c it was needed for some problems (perhaps even Schwarz?). Unfortunately there are still some failures due to some third unknown at this time issue:
I'll try to figure out what is going on with these. I'm curious if the problem you are running is affected by the third (unknown) issue, or one of the 2 I mentioned above (1.) should be fixable by changing the predictor method, 2.) should be fixed now in your branch). |
adding @etphipp for loca |
Regarding LOCA, if you use pseudo-arclength continuation, then Newton steps are constrained to be orthogonal to the predictor direction, which may include changes to the parameter throughout the nonlinear solve for a step (this is how it gets around turning points). When you reach a bound for the parameter, LOCA does its best to try and pick a step size so the converged solution will hit the bound, but because of the pseudo-arclength constraint described above, it may go past the bound slightly, in which case LOCA does one extra step of natural continuation (where the parameter is fixed). When the parameter is time, this can cause a negative delta-T. If you know your system does not have bifurcations, the simplest solution is to just use natural continuation. This is similar to using the constant predictor as described above, but results in a simpler nonlinear system. If you really need arclength continuation, you have to accept that there might be steps with negative time steps. |
Thanks for the explanation @etphipp . I did notice when I was playing around with it that for the cases where 1.) was the issue, changing to natural continuation fixed the problem. |
@ikalash My recollection is that the fixTime function was already there, and I only wrote a special case for Schwarz. Since Justin is not using Schwarz, he should not be affected if there is an error in that logic. Git blame says that I modified this function last, but that was probably just formatting, as I always apply clang-format liberally. In fact, to this day I'm not sure what the purpose if this function is, other that what it seems to do from the logic in the code (use the current time or a supplied value of time if there is one in the parameter list). I only remember that it was messing up the time for Schwarz, and thus the logic for it. |
Thanks @lxmota . It could very well be that git blame is misleading, as you say. Another question: were you the original author of setCurrentTime() and getCurrentTime()? That's another routine that leads to part of the problem (seems there is a missing setCurrenTime() call in the Albany::PiroObserver class that is needed given this logic in Albany::ModelEvaluator: https://github.com/SNLComputation/Albany/blob/master/src/Albany_ModelEvaluator.cpp#L691 ). git blame says it was Luca but I seem to remember that the routines were added for Schwarz - perhaps my memory is failing me (again)? |
There used to be two classes, PiroObserver and PiroObserverT. Could be that during the unification during the Thyra refactor I lost some details. |
Sorry for the delayed response. As far as I know, my situation is the second one you mentioned:
|
No worries. I haven't had time to look at this more since my last comment. In this case it should be fixed in your branch - please give it a try. You could also turn off the response to see if it makes the issue go away to double check that that is indeed it. I think you found something interesting and the behavior in master where the dt can be negative should not happen. I think putting an assert to prevent dt from being negative like you did in your branch, and fixing the cases where it happens would be worthwhile in master. Does anyone disagree? I'd still like to understand what is going on with the third case I mentioned that isn't covered by 1.) and 2.). Also my fixes in Justin's branch break a couple of tests for some reason and I'd like to understand that too - something I did must be a bit off. Anyway, lots to do, but hopefully with a temporary fix in place, it's not super urgent. |
The value of
deltaTime(0)
is negative afterSTKDiscretization::writeSolution(...)
is called. I think this may be fromworkset.current_time
being set to zero after the write here.This is an issue for me as the creep model uses this
deltaTime(0)
value to compute its state and I happen to be using Gmsh meshes which are built into STK.I made a commit on a separate branch that shows which other tests could be affected by this. I just added a simple Teuchos check for negative
deltaTime(0)
values.Any ideas on a solution or temporary workaround?
The text was updated successfully, but these errors were encountered: