My idea is: 1) Determine how frequently you want to update the number on screen. Every frame is too often, once per second is too infrequent. Maybe 4× per second is OK. 2) Determine how much time to average over. One or two seconds seems fine. 3) From those two numbers, compute number of buckets. For example update 4× per second, average over 2 seconds, gives you 8 buckets. 4) Every time a frame is drawn, based on current time, you either a) increment the counter in the last bucket or b) shift all buckets over one position and increment the new last bucket from zero to one. 5) Average the sum of frames drawn over that time period you decided earlier. 6) Don't display fraction of a FPS. 7) Done. It costs you 8 integers, plus maybe one more for a timestamp of the last bucket, plus maybe one more to implement ring buffer instead of shifting all those integers over. Less than a single CPU cache line.
Is there a way to have benefits of both? Version 7 for better database clustering. And version 4 for complete randomness? So users can not inference nothing from the id? I have an idea: Use version 7 internally, then scramble it before sending to the user. Scrambling could be done by the database or by the server application. It could be as simple as XOR with some 128bit constant, or as resilient as AES encryption. Of course you also need to do unscrambling of IDs coming from users.
If privacy is the main concern (as it is in most usage of UUIDs) you could just encrypt the integer primary key instead with something like feistel and avoid the performance problems of UUIDs while still having opaque public identifiers.
On division: There is a paper about a division algorithm. If you split your 128bit integer into four 32bit integers, you can use a 64bit CPU built-in instruction to do the 128bit division. If you are on 32bit CPU that doesn't have the 32bit division operation, then split your 128bit number into 8 16bit integers and leverage the 32bit CPU built-in operation. https://surface.syr.edu/eecs_techreports/166/
And having a CPU with "128-by-64" or "64-by-32" division instruction (such as x86/x64) simplifies the algorithm even further.
Inputs: uint[128] N, uint[128] D
Outputs: uint[128] Q, uint[128] R
such that Q ≤ N, R < D, and N = Q * D + R
Uses: (uint[64] quot, uint[64] rem) hw_divmod(uint[128] dividend, uint[64] divisor)
if D.high = 0:
# long division; quotient can be 128-bit wide, but remainder will fit in 64 bits
uint[192] N' := N ≪ lzcnt(D.low)
uint[64] D' := D ≪ lzcnt(D.low)
uint[64] Q', uint[64] R' := hw_divmod(N'.high ≪ 64 + N'.mid, D')
uint[64] Q'', uint[64] R'' := hw_divmod(R' ≪ 64 + N'.low, D')
Q := Q' ≪ 64 + Q''
R := R'' ≫ lzcnt(D.low)
else:
# Quotient will fit in 64 bits, but remainder could be 128-bits wide
uint[192] N' := N ≪ lzcnt(D.high)
uint[128] D' := D ≪ lzcnt(D.high)
uint[64] Q', uint[64] R' := hw_divmod(N'.high ≪ 64 + N'.mid, D'.high)
uint[128] R'' := R' ≪ 64 + N'.low
uint[128] T := Q' * D'.low
if R'' < T:
Q := Q' - 1
R := D' + R'' - T
else:
Q := Q'
R := R'' - T
R := R ≫ lzcnt(D.high)
It's pseudocode. A mix of Python's block structure, C's style of variable introduction, and Pascal's use of ":=" for assignment and "=" for equality. Oh, and assigning to output variables instead of explicit return is from Go, although the tradition is much older, of course.
As for using "'" in variable names, can't recall from the top of my head what language it's from (I don't think it's Lisp), but something does that.
Nice, that's what i noticed; I rather like its clean structure.
But since it was put together by you/not-you, does it have a name and is it used as a Specification/IR/DSL language somewhere (since you gave the algorithm in it) ?
No, Windows has stable API and ABI named Win32 API. This comes from times of 16bit Windows and works also in x64 / ARM / ARM64, previously it worked in Alpha / MIPS / IA64. This API is implemented in kernel.dll user.dll gdi.dll advapi.dll and similar, it does some stuff, but mostly forwards to the NT API. Beware, kernel.dll is user space component despite its name (historical reasons). NT API is undocumented and not meant to be used by user programs, it is not stable, it lives in ntdll.dll, it does syscalls to the kernel: ntoskrnl.exe. Windows doesn't have a libc (for user programs, it has private one for its own programs), Visual Studio has a libc. Each version of Visual Studio (roughly) has its own libc named msvcrt.dll msvrt100.dll msvcrt140.dll and similar, it hosts the C and C++ libc, it could be linked statically for various benefits and drawbacks.
A big motivation for it was security posture; it means that Microsoft can now ship security updates to UCRT that everyone can rely on rather than a ton of extra surface area through various multiple versions of runtime libraries.
It had (has) an unsupported, crippled, unversioned msvcrt.dll which if you used it very carefully with a subset of functions, you could write programs which worked fine on Windows NT and up.
FWIW, large swaths of ntdll are documented and supported these days, for performing work that can't be expressed via the win32 API like raw disk manipulation.
What about HW decoders? Of video formats (that are much more complex than image decoders)? Such as MPEG, H.264, H.265, VP9, AV1 and similar. If memory allocation is needed here and there, I guess there is always known maximum size of such allocation in advance. Written in the spec. Think of at chip design time, or at software decoder compile time. How else would HW decoders be even possible?
Also: Hey Google, do you even fuzz-test? Your own stuff?
The article says that GLONASS time counts days from zero to 1461, then resets back to zero. This is 4 years if we think leap year is once 4 years. But it is not. Leap year is once 4 years, but each 100 years it is not, except each 400 years it is. This is on average 365.2425 days per year not 365.25 days per year. Is GLONASS not long term thinking ahead enough?
GLONASS transmissions have fields for earth orientation parameters that are too small to allow for unbounded DUT1: it has a built-in assumption that its system time is close to earth rotation angle.
What about NaN? Does Erlang have NaN? How does Erlang compare two variables containing a NaN (or expressions evaluating to a NaN)? There are (2^24)-2 different NaNs in a 32bit IEEE 754 float, and there are (2^53)-2 different NaNs in a 64bit IEEE 754 double. The IEEE 754 standard says, that a NaN value should compare to any other number (including any other NaN) as not equal. Yes, the expression `x == x` could return `false`.