Adds throttling based on average bandwidth usage per HTTP service.
Since only HTTP textures are using this, they are still starved by other
services like inventory and mesh downloads. The maximum number of
connections per service will also need to be moved to the PerService
class so they can be tuned dynamically: reducing the number of
connections is the first thing to do when using too much bandwidth.
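As a minimal sketch of that direction (all names here are hypothetical,
not existing viewer code):

    #include <algorithm>

    // Hypothetical per-service throttle state; in the viewer this would
    // live in (or near) the PerService class mentioned above.
    struct PerServiceThrottleSketch
    {
        double mAverageBytesPerSec = 0.0; // running average for this service
        int mMaxConnections = 8;          // would become a PerService member

        // When the service uses too much bandwidth, reducing the number
        // of connections is the first measure taken.
        void tune(double allowed_bytes_per_sec)
        {
            if (mAverageBytesPerSec > allowed_bytes_per_sec)
                mMaxConnections = std::max(1, mMaxConnections - 1);
        }
    };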
I also added a graph for HTTP texture bandwidth to the stats floater.
For some reason the average bandwidth (over 1 second) looks almost like
scattered noise... weird for something that is averaged...
Note that HTTPTimeout::sClockWidth is no longer used for HTTPTimeout (as
its value effectively became a constant of 0.01, the fraction 10 ms / 1 s).
A new variable, HTTPTimeout::sClockWidth_10ms, is used to calculate
sTime_10ms (only).
Also, HTTPTimeout::mStalled and HTTPTimeout::mLowSpeedClock changed units
to the same as sTime_10ms (time since epoch in 10 ms units).
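For concreteness, a self-contained sketch of that unit convention
(std::chrono stands in for the real clock source, which is converted
with sClockWidth_10ms instead):

    #include <chrono>

    typedef unsigned long long U64;

    // Time since epoch expressed in 10 ms units; sTime_10ms, mStalled
    // and mLowSpeedClock all use this convention.
    U64 time_since_epoch_10ms()
    {
        using namespace std::chrono;
        return duration_cast<milliseconds>(
            system_clock::now().time_since_epoch()).count() / 10;
    }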
Rationale: LL does all throttling per service (host:port), not per
hostname. Also, textures and capabilities use the same host: the
sim you are connected to. Splitting the queues up on a per-service basis
will stop the textures from blocking a capability request.
After this commit things compile again :).
The HTTP bandwidth throttling is not yet implemented. In the next
commit I'll put back a temporary fix that just does it the "old way"...
This also includes non-texture requests (not sure if it's worth it to
explicitly separate this data out for just textures).
The texture console now prints: HTTP:c/q/a/r
where c = number of 'add' request commands in the command queue
(minus the number of 'remove' requests in the command queue; this does
not take into account the command_being_processed, nor is it entirely
thread-safe with regard to requests being added to 'q' or 'a' next: it
is possible that a request is no longer counted in 'c' but is not yet
added to 'q' or 'a').
q = number of queued (throttled) requests (by the curl thread).
a = number of actually added requests (to the multi handle).
r = last returned value of 'running handles' by libcurl.
Obviously, c and q should be small (0 or 1) most of the time and a and r
should be equal and maxed out. This turns out to be the case.
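A trivial sketch of how such a console line can be composed (the
function itself is hypothetical; the counters are the ones described
above):

    #include <cstdio>

    // c/q/a/r: pending 'add' commands, queued (throttled) requests,
    // requests added to the multi handle, and libcurl's last
    // 'running handles' value.
    void print_http_console_line(int c, int q, int a, int r)
    {
        printf("HTTP:%d/%d/%d/%d\n", c, q, a, r);
    }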
Renamed PerHostRequestQueue and PerHostRequestQueuePtr to
AIPerHostRequestQueue and AIPerHostRequestQueuePtr respectively, and
moved them to the global namespace. This is in preparation for using
them in the texture fetcher (as a function of the hostname of the url
that is constructed there).
This new variable is updated to contain the total number of requests in
MultiHandle::mAddedEasyRequests. It can be static because MultiHandle
is a singleton (there is only one curl thread), and it has to be an
(atomic) static because we don't want to have to take the lock on
MultiHandle to access this variable.
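A minimal sketch, using std::atomic where the viewer has its own atomic
wrapper (the counter's name is an assumption):

    #include <atomic>

    class MultiHandleSketch
    {
      public:
        // Mirrors the size of mAddedEasyRequests. Static because
        // MultiHandle is a singleton (one curl thread); atomic so other
        // threads can read it without taking the MultiHandle lock.
        static std::atomic<int> sTotalAdded;  // hypothetical name
    };

    std::atomic<int> MultiHandleSketch::sTotalAdded{0};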
The new variable is updated to contain the sum of the size of
PerHostRequestQueue::mQueuedRequests of every PerHostRequestQueue
object, and thus the total number of queued requests for all hosts
together.
Also added PerHostRequestQueue::host_queued_plus_added_size(), which
returns the sum of the queued requests plus the requests already added
to the multi handle for this particular hostname.
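Sketched under assumed member names (only host_queued_plus_added_size()
and mQueuedRequests appear above; mAdded and sTotalQueued are stand-ins):

    #include <atomic>
    #include <deque>

    struct QueuedRequest {};             // stand-in for the real type

    // Total queued requests over all hosts together (the 'new variable').
    std::atomic<int> sTotalQueued{0};    // hypothetical name

    struct PerHostRequestQueueSketch
    {
        std::deque<QueuedRequest> mQueuedRequests; // throttled requests
        int mAdded = 0;                  // added to the multi handle (assumed)

        // Queued plus already-added requests for this hostname.
        size_t host_queued_plus_added_size() const
        {
            return mQueuedRequests.size() + mAdded;
        }
    };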
This adds a size (command_queue_st::size) to the ThreadSafe
deque<Command> that was AICurlPrivate::command_queue; the size is kept
equal to the number of 'add' commands in the deque minus the number of
'remove' commands in the deque. The deque itself is now
command_queue_st::commands.
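A sketch of the new layout (Command stands in for the viewer's command
type; the push/pop bookkeeping shows the invariant described above):

    #include <deque>

    struct Command { bool is_add; };  // stand-in for AICurlPrivate's type

    struct command_queue_st
    {
        std::deque<Command> commands; // was AICurlPrivate::command_queue
        int size = 0;                 // 'add' commands minus 'remove' commands

        void push(Command const& cmd)
        {
            commands.push_back(cmd);
            size += cmd.is_add ? 1 : -1;  // keep the invariant
        }

        void pop()
        {
            size -= commands.front().is_add ? 1 : -1;
            commands.pop_front();
        }
    };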
Replaces LLTextureFetch::getNumHTTPRequests.
Returns AICurlInterface::Stats::running_handles.
This is work in progress that temporarily doesn't compile because
LLTextureFetch::getNumHTTPRequests is still being used somewhere.
All HTTP timing is done by AIHTTPTimeoutPolicy.
Inside LLTextureFetchWorker::doWork, when mState == SEND_HTTP_REQ,
mCanUseHTTP is true, throttling is not in effect and mURL is not empty,
mLoaded is set to FALSE, mState is set to WAIT_HTTP_REQ and
LLHTTPClient::request is called, which starts the download by curl.
A callback to LLTextureFetchWorker::callbackHttpGet is guaranteed,
which causes mLoaded to be set to TRUE (HTTPGetResponder::completedRaw
calls LLTextureFetchWorker::callbackHttpGet, which sets mRequestedSize
to -1 (if there was an error) and mLoaded to TRUE).
Being in state WAIT_HTTP_REQ, once mLoaded == TRUE (and mRequestedSize
is -1), the different timeout errors are handled.
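A condensed, self-contained sketch of this flow; the state and member
names follow the text, everything else (mThrottled, the helpers, the
url) is a stand-in:

    #include <string>

    enum EState { SEND_HTTP_REQ, WAIT_HTTP_REQ };

    struct TextureFetchWorkerSketch
    {
        EState mState = SEND_HTTP_REQ;
        bool mCanUseHTTP = true;
        bool mLoaded = false;
        bool mThrottled = false;          // assumed flag
        int mRequestedSize = 0;
        std::string mURL = "http://sim.example/texture";  // hypothetical

        void startHttpRequest() { /* stands in for LLHTTPClient::request */ }
        void handleTimeoutErrors() { /* the timeout handling above */ }

        // Guaranteed callback (via HTTPGetResponder::completedRaw):
        void callbackHttpGet(bool error)
        {
            if (error) mRequestedSize = -1;
            mLoaded = true;
        }

        void doWork()
        {
            if (mState == SEND_HTTP_REQ && mCanUseHTTP && !mThrottled &&
                !mURL.empty())
            {
                mLoaded = false;
                mState = WAIT_HTTP_REQ;
                startHttpRequest();       // curl starts the download
            }
            else if (mState == WAIT_HTTP_REQ && mLoaded &&
                     mRequestedSize == -1)
            {
                handleTimeoutErrors();    // the different timeout errors
            }
        }
    };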
This fixes https://code.google.com/p/singularity-viewer/issues/detail?id=705
Adds 'bool redirect_status_ok(void) const { return true; }' to LLIamHere,
because it's ok to receive a 302 status there. Likewise added to
LLIamHereVoice, because that has the same comment in its error() method.
Also fixes the problem that if two redirects occur in a row, the
upload_finished detection asserted, because it would detect the third
time that libcurl turned off writing to the socket as a failure (the
second time wasn't a problem because mUploadFinished was reset upon
receiving the first 302 header, but not upon receiving the second
header).
This is now necessary since the curl thread no longer syncs with the
main thread: it is possible that a request finishes after a texture
fetch thread was shut down but before curl was stopped, with curl
calling BufferedCurlEasyRequest::processOutput while objects that the
responder uses were already destructed (most notably LLTextureFetch
itself).
This value really IS in bytes/s (not for the total), and apparently 56
kB/s is too optimistic. The value was used by LL for transfers that went
beyond the total download time (2 minutes?), adding enough seconds per
received byte to the timeout to allow a download at 56 kB/s to finish.
Our meaning is different however: we time out immediately whenever
the download drops below this speed.
Perhaps a better algorithm is one where the speed demand is based on the
total size of the download, but I'm not sure we always know the size of
downloads at this point.
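One possible reading of such an algorithm, purely as a hypothetical
sketch (none of these parameters exist as settings): derive the
per-download speed floor from the total size and a time budget instead
of using a flat bytes/s value:

    // Demand that the download could finish within 'budget_seconds',
    // but never demand less than a fixed floor.
    double required_bytes_per_sec(double total_size_bytes,
                                  double budget_seconds,
                                  double floor_bytes_per_sec)
    {
        double demanded = total_size_bytes / budget_seconds;
        return demanded > floor_bytes_per_sec ? demanded
                                              : floor_bytes_per_sec;
    }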
When uploading finishes but this is not detected, the timeout should be
the "reply delay", the time that the server takes before it replies, and
not CurlTimeoutLowSpeedTime. This patch adds code that takes this failure
into account (which happened only ONCE for me on Metropolis while flying
around and using trickle (not sure if that is relevant), so it's not
that likely to improve anything in practice. Note that it is detected
by an assertion when it happens, so we can safely assume it normally
never happened on SL).
* Generalized PUT / POST configuration by adding
CurlEasyRequest::setPut, which now also supports keep-alive (which
still isn't used).
* Upload content length is now stored in CurlEasyRequest::mContentLength.
* CurlEasyRequest::has_stalled() now returns false if it was possible
that the 'upload finished' detection failed, AND calls upload_finished()
itself in that case, so it is no longer 'const'.
* If low speed is detected exactly when the last bytes are being
attempted to be sent (unlikely scenario), then the upload gets 4 more
seconds, after which it switches to CurlTimeoutReplyDelay.
* Added EDoesAuthentication and EAllowCompressedReply to replace
booleans, for readability and type-safety, as was done earlier with
EKeepAlive. Note that this change inverts the meaning of the
compression related parameter (see the sketch after this list).
* Unrelated: removed an unnecessary #include "llurlrequest.h" from
llxmlrpcresponder.h
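A sketch of the boolean-to-enum change from the list above; only the
type names come from the text, the enumerators and the setPut signature
are assumptions:

    // Self-documenting call sites instead of bare true/false arguments.
    enum EKeepAlive            { no_keep_alive, keep_alive };
    enum EDoesAuthentication   { no_does_authentication, does_authentication };
    enum EAllowCompressedReply { no_allow_compressed_reply, allow_compressed_reply };

    // A (hypothetical) signature then reads:
    //   void setPut(char const* body, size_t len, EKeepAlive keep_alive);
    // and a call like setPut(body, len, keep_alive) can no longer be
    // confused with an inverted boolean, which is exactly the mistake
    // the inverted compression parameter would otherwise invite.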
The responder name is now cached in LLURLRequest
(ResponderBase::getName() must return a string literal).
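A minimal sketch of that contract (the classes here are stand-ins; only
getName() and the string-literal requirement come from the text):

    struct ResponderBaseSketch
    {
        // Must return a string literal: LLURLRequest caches the returned
        // pointer, so it has to outlive the responder.
        virtual char const* getName() const = 0;
        virtual ~ResponderBaseSketch() {}
    };

    struct HTTPGetResponderSketch : ResponderBaseSketch
    {
        char const* getName() const override { return "HTTPGetResponder"; }
    };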
The run time (in the main thread) per state machine is now accumulated
in AIStateMachine (instead of AIEngine::QueueElement).
When AIStateMachine::mainloop runs longer than StateMachineMaxTime,
a warning is printed that now includes the time spent in the
slowest state machine (that frame) and (if it is an LLURLRequest)
what the corresponding responder is. Also the total accumulated run
time of that state machine is printed.
From this it can be concluded that the only responder currently
regularly holding up the main thread is LLMeshLODResponder (mostly 30 to
100 ms, but sometimes with spikes in the 1 to 2 second range).
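A sketch of that reporting, under assumed names and units:

    #include <cstdio>

    struct StateMachineStatsSketch
    {
        double mTotalRunTime_ms;    // accumulated in AIStateMachine
        char const* mResponderName; // e.g. "LLMeshLODResponder"
    };

    // Called at the end of a mainloop pass (sketch): warn when the frame
    // exceeded StateMachineMaxTime and name the slowest machine.
    void maybe_warn(double frame_ms, double max_ms,
                    StateMachineStatsSketch const& slowest,
                    double slowest_ms)
    {
        if (frame_ms > max_ms)
            printf("mainloop took %.0f ms; slowest state machine: %s "
                   "(%.0f ms this frame, %.0f ms accumulated)\n",
                   frame_ms, slowest.mResponderName, slowest_ms,
                   slowest.mTotalRunTime_ms);
    }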
This fixes a bug where unref() was called when a state machine was
aborted before it reached bs_initialized. Debug code was added to detect
errors related to that.
In order to run HTTPGetResponder in any thread, I needed direct access
to LLHTTPClient::request, so I had to move that to the header file,
and therefore had to move ERequestAction from LLURLRequest to
LLHTTPClient to avoid include problems.
With this, textures are fetched with no latency: the call to
LLHTTPClient::request runs all the way until the state machine is idle
(AICurlEasyRequestStateMachine_waitAdded). There is a small delay until
the curl thread wakes up, which then processes the request, opens the
url, etc. When the transaction is finished, it calls
AIStateMachine::advance_state(AICurlEasyRequestStateMachine_removed_after_finished),
which subsequently doesn't return until the state machine is completely
finished (bs_killed). The LLURLRequest isn't deleted yet at that point
because the AITimer of the LLURLRequest runs in the main thread: it is
aborted, but only deleted the next time the main-thread state engines
run; and since the timer keeps an LLPointer to its parent, the
LLURLRequest, only then is the LLURLRequest object destructed. This
however has nothing to do with the texture-bandwidth loop.
Removed AICurlEasyRequestStateMachine_added and
AICurlEasyRequestStateMachine_finished because the state machine doesn't
take any action there anyway, and those states might even be skipped
altogether, so they make no sense / shouldn't exist.
This reverts commit ef35aa7954
because it contained too many things that are wrong and that I won't
be using. I'll re-commit the parts of it that I do want to keep.
This work extends AIStateMachine to run multiplex() in the thread
that calls run(), cont() or set_state(). Note that all three
eventually call locked_cont(), so that's where multiplex() is called
from. Calling multiplex() means "running the state machine", as in
"calling multiplex_impl".
Currently only LLURLRequest uses this feature, and then only
for the HTTPGetResponder, and well only for the initializing,
start up and normal finish states.
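A minimal sketch of that dispatch; mRunInCallingThread is an assumed
stand-in for however the feature is actually selected per machine:

    struct StateMachineSketch
    {
        bool mRunInCallingThread = false;  // assumed selection mechanism

        void multiplex_impl() { /* the actual states */ }
        void multiplex() { multiplex_impl(); } // "running the state machine"

        // run(), cont() and set_state() all funnel into locked_cont(),
        // so this is the single place multiplex() is called from:
        void locked_cont()
        {
            if (mRunInCallingThread)
                multiplex();  // runs in whichever thread called us
        }

        void run()          { locked_cont(); }
        void cont()         { locked_cont(); }
        void set_state(int) { locked_cont(); }
    };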
A current/remaining problem is that we run into a situation where
the curl thread runs a state machine to its finish and kills it,
while the main thread is also 'running' it and tries to call
multiplex while the state machine isn't running anymore.