Tuesday, 21 May 2013
- PF_RING
- netmap
- netmap prez
- Intel DPDK
- Intel DPDK getting started
- user space network stack
- Direct cache access for high bandwidth I/O
- Efficient network applications on multi-core Linux
Tuesday, 14 May 2013
thumbnails
photos
thumbnails
photos
Ski (thumbnails)
Ski (large)
Tchoupi is spending a weekend at home before going back to school.
thumbnails
photos
thumbnails
photos
thumbnails
photos
thumbnails
photos
Monday, 13 May 2013
thumbnails
photos
Monday, 1 October 2012
List of papers found about floating point hacks
- Christian Plesner Hansen on 0x5f3759df
- McEniry on 0x5f3759df
- Lomont on 0x5f3759df
- Jim Blinn's floating point hacks
- Matthew Robertson on 0x5f3759df
- Jan Kadlec on 0x5f3759df
- Chris Miller on 0x5f3759df
- cbloom: fast log and fast exp
- Nicol Schraudolph: fast exp
- Martin Ankerl: fast pow
- Jose Fonseca's fast pow
- Harrison Ainsworth's fast pow
- Bruce Dawson tricks part 1
- Bruce Dawson tricks part 2
- Bruce Dawson tricks part 3
One odd thing is that some of these tricks work for x > 1 but not for x in [0, 1[.
Everything below was tested for x in [0, 1[.
invsqrt
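With the following code:
#include <stdint.h>

float myinvsqrt(float x)
{
    union { float f; uint32_t u; } y;   /* view the float's bits as an integer */
    // float xhalf = 0.5f * x;
    y.f = x;
    y.u = 0x5f3759df - (y.u >> 1);      /* magic constant minus half the bit pattern */
    // y.f = y.f * (1.5f - (xhalf * y.f * y.f));   /* Newton correction, disabled */
    return y.f;
}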
I even removed the Newton correction! Here's the graph: inv sqrt approx
Pretty accurate, that one works!
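Where a constant like 0x5f3759df comes from can be sketched with the usual argument from the papers listed above: the bit pattern of a positive float, read as an integer, is roughly 2^23 * (log2(x) + 127 - sigma) for a small tuning term sigma. Approximating x^p then needs the constant K = (1 - p) * (127 - sigma) * 2^23. The helper name magic_for_exponent and the value sigma = 0.0450466 are assumptions made for illustration, not something taken from the original post.

```c
#include <stdint.h>

/* Hypothetical helper: magic constant K for approximating x^p with
 *     y.u = K + p * x.u      (p = -1/2 gives K - (x.u >> 1)).
 * sigma is a tuning parameter; 0.0450466 is an assumed value. */
uint32_t magic_for_exponent(double p, double sigma)
{
    /* 8388608 = 2^23, the weight of the exponent field in the bit pattern */
    return (uint32_t)((1.0 - p) * (127.0 - sigma) * 8388608.0);
}
```

With p = -0.5 and sigma = 0.0450466 the result should land very close to 0x5f3759df, and with p = -0.125 close to the 0x476983e4 used for the -1/8 power further down. This is one common way to obtain such constants, not necessarily how either constant was originally found.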
log2
With the following code:
#include <stdint.h>

float mylog2(float x)
{
    union { float f; uint32_t u; } y;
    y.f = x;
    /* 8388608 = 2^23 (mantissa width), 127 = exponent bias */
    return (float)(y.u / 8388608.0 - 127.0);
}
Here's the graph: log2 approx
Pretty accurate, that one works!
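Why reading the bits as an integer gives a logarithm can be written out as follows. This is a sketch that assumes x is a positive, normal IEEE 754 single-precision value with unbiased exponent e and mantissa fraction m; the symbol I_x for the integer view of the bits is introduced here for illustration.

```latex
\begin{align*}
x &= 2^{e}\,(1+m), \qquad 0 \le m < 1 \\
I_x &= 2^{23}\,(e + 127) + 2^{23}\,m = 2^{23}\,(e + 127 + m) \\
\frac{I_x}{2^{23}} - 127 &= e + m \;\approx\; e + \log_2(1+m) = \log_2 x \\
0 &\le \log_2(1+m) - m \le 0.0861 \quad \text{(largest near } m \approx 0.4427\text{)}
\end{align*}
```

So the estimate is exact at powers of two and falls short of the true log2 by at most about 0.086 in between.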
exp2
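With the following code:
#include <stdint.h>

float myexp2(float x)
{
    union { float f; uint32_t u; } y;
    /* inverse of mylog2: add the bias, scale by 2^23, reinterpret as a float;
       the conversion is only defined while x + 127 stays non-negative */
    y.u = (uint32_t)((x + 127.0f) * 8388608.0f);
    return y.f;
}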
Here's the graph: exp2 approx
Not very accurate; I don't know whether it works better for x > 1.
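On the open question about x > 1: as far as the bit layout goes, the approximation returns 2^floor(x) * (1 + frac(x)) where the exact value is 2^floor(x) * 2^frac(x), so the relative error should depend only on the fractional part of x and not on its size. The sketch below probes this at x = k + 0.5, assuming k >= 0 and a hard-coded sqrt(2); exp2_ratio_at_half is a hypothetical helper, not part of the original post.

```c
#include <stdint.h>

/* Same approximation as in the post. */
float myexp2(float x)
{
    union { float f; uint32_t u; } y;
    y.u = (uint32_t)((x + 127.0f) * 8388608.0f);
    return y.f;
}

/* Hypothetical helper: ratio approx / exact at x = k + 0.5 for k >= 0.
 * The exact value 2^(k + 0.5) is built as sqrt(2) doubled k times. */
float exp2_ratio_at_half(int k)
{
    float exact = 1.41421356f;           /* sqrt(2) = 2^0.5 */
    for (int i = 0; i < k; ++i)
        exact *= 2.0f;
    return myexp2((float)k + 0.5f) / exact;
}
```

If the reasoning holds, exp2_ratio_at_half(0), exp2_ratio_at_half(3) and exp2_ratio_at_half(10) all come out near 1.5 / sqrt(2), that is about 6% too high, which suggests the accuracy for x > 1 is neither better nor worse than on [0, 1[.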
pow(a, b)
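By combining log2 and exp2, we can create an estimate of pow like this:
#include <stdint.h>

/* named mypow rather than pow so that it does not clash with pow() from <math.h> */
float mypow(float a, float b)
{
    union { float f; uint32_t u; } y;
    y.f = a;
    /* scale the log2 estimate of a by b, then re-apply the exponent bias;
       only meaningful while the result stays non-negative */
    y.u = (uint32_t)(y.u * b - 127 * (b - 1) * 8388608);
    return y.f;
}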
Example with b = 8.125. Here's the graph: pow(x, 8.125) approx
Hmm, not very good. It needs a Newton-Raphson iteration. Adding a single iteration (x is the input, called a in the function above):
y.f = y.f * ((1-8.125) + 8.125 * x * powf(y.f, -1.f/8.125f));
This gives this graph: pow(x, 8.125) approx with newton
Much better. The only problem is that I need powf for the Newton iteration, so this is work in progress. I hope to find a solution soon.
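For reference, the correction line can be recovered as a plain Newton-Raphson step. The derivation below assumes the iteration is applied to f(y) = y^(1/b) - x, whose root is y = x^b; that choice of f is an inference from the formula, not something stated in the post.

```latex
\begin{align*}
f(y) &= y^{1/b} - x, \qquad f'(y) = \tfrac{1}{b}\, y^{1/b - 1} \\
y_{n+1} &= y_n - \frac{f(y_n)}{f'(y_n)}
         = y_n - b\,\bigl(y_n^{1/b} - x\bigr)\, y_n^{\,1 - 1/b} \\
        &= y_n \Bigl( (1 - b) + b\, x\, y_n^{-1/b} \Bigr)
\end{align*}
```

With b = 8.125 this is term by term the line above, and the y^(-1/b) factor is where the powf call comes from.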
pow(x, -1/8)
I was able to derive the -1/8 power. Here it is:
#include <stdint.h>

float mypowm0_125(float x)
{
    union { float f; uint32_t u; } y;
    y.f = x;
    y.u = 0x476983e4 - (y.u >> 3);   /* magic constant minus 1/8 of the bit pattern */
    return y.f;
}
Here's the graph: -1/8 approx
That one needs a Newton iteration to correct for the error.
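For this particular exponent the Newton step can be written with multiplications only, so no powf is needed. The sketch below assumes Newton-Raphson on f(y) = y^-8 - x, which simplifies to y <- y * (9 - x * y^8) / 8. The helpers mypowm0_125_newton and residual8 are hypothetical names introduced for illustration, and x is assumed to be a positive normal float.

```c
#include <stdint.h>

/* Bit-level estimate of x^(-1/8), as in the post. */
float mypowm0_125(float x)
{
    union { float f; uint32_t u; } y;
    y.f = x;
    y.u = 0x476983e4u - (y.u >> 3);
    return y.f;
}

/* Hypothetical helper: the estimate followed by one Newton-Raphson step
 * on f(y) = y^-8 - x, i.e. y <- y * (9 - x * y^8) / 8. */
float mypowm0_125_newton(float x)
{
    float y  = mypowm0_125(x);
    float y2 = y * y;
    float y4 = y2 * y2;
    float y8 = y4 * y4;
    return y * (1.125f - 0.125f * x * y8);
}

/* Hypothetical helper: x * y^8 - 1, which is 0 when y equals x^(-1/8) exactly. */
float residual8(float x, float y)
{
    float y2 = y * y;
    float y4 = y2 * y2;
    return x * y4 * y4 - 1.0f;
}
```

The residual is roughly eight times the relative error of y, so comparing residual8(x, mypowm0_125(x)) with residual8(x, mypowm0_125_newton(x)) gives a quick accuracy check without calling the math library. One step should take the relative error from a few percent to well under one percent.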
Wednesday, 30 May 2012
Belated photos from April 2012 in San Francisco with Grandpa and Grandma from the Charentes
thumbnails
large photos
Belated photos from April 2012 in NY with Grandpa and Grandma from the Charentes
thumbnails
large photos
That's it, our little sweetheart no longer needs training wheels.
You would almost think she had been doing it all her life.
Dad, you are going to have to invest this summer!
The video.
Tuesday, 22 May 2012
This Sunday, Zoe gave her dance show!
A few photos
thumbnails
large photos
Tuesday, 17 April 2012
Dr. Dobb's articles, mostly by Herb Sutter:
- The Free Lunch Is Over by Herb Sutter
- Design for Manycore Systems by Herb Sutter
- Lock-Free Queues by Petru Marginean
- Lock-Free Code: A False Sense of Security by Herb Sutter
- Writing Lock-Free Code: A Corrected Queue by Herb Sutter
- Writing a Generalized Concurrent Queue by Herb Sutter
- Measuring Parallel Performance: Optimizing a Concurrent Queue by Herb Sutter
- Understanding Parallel Performance by Herb Sutter
Dr. Dobb's articles by Andrei Alexandrescu:
- Lock-Free Data Structures by Andrei Alexandrescu
- Lock-Free Data Structures with Hazard Pointers by Andrei Alexandrescu
Articles by Bartosz Milewski:
- Multicores and Publication Safety
- Who ordered memory fences on an x86?
- Who ordered sequential consistency?
- C++ atomics and memory ordering
- The Inscrutable C++ Memory Model
Theoretical papers:
- CAS-Based Lock-Free Algorithm for Shared Deques by Maged M. Michael
- Obstruction-Free Synchronization: Double-Ended Queues as an Example by Maurice Herlihy, Victor Luchangco and Mark Moir
- Lock-free Dynamically Resizable Arrays
- Lock-Free and Practical Deques and Doubly Linked Lists using Single-Word Compare-And-Swap
- An Optimistic Approach to Lock-Free FIFO Queues
- Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems
- Split-Ordered Lists: Lock-Free Extensible Hash Tables
- High Performance Dynamic Lock-Free Hash Tables and List-Based Sets
- Practical lock-freedom
- Real-Time Computing with Lock-Free Shared Objects
- Scalable lock-free dynamic memory allocation
- Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms
- Hazard pointers
- Safe memory reclamation for dynamic lock-free objects using atomic reads and writes
- The repeat offender problem (ROP)
- Using Elimination to Implement Scalable and Lock-Free FIFO Queues
- J. D. Valois. Implementing Lock-Free Queues.
- Correction of a memory management method for lock-free data structures
- Allocating memory in a lock-free manner
- A Scalable Lock-free Stack Algorithm
- A Practical Multi-Word Compare-and-Swap Operation
- Scalable Queue-Based Spin Locks with Timeout
- Mostly Lock-Free Malloc
- Wait free SPSC bounded and unbounded queue with no barrier
Others:
- Gcc atomic wiki page
- Lock free containers
- Atomic counter whitepaper
- Futexes
- Mutex and memory visibility
- Memory Barriers: a Hardware View for Software Hackers
- Introduction to lock-free/wait-free and the ABA problem here and here
- non blocking multi producers single consumer queue
- multi producer/multi consumer
- What every programmer should know about memory
- Understanding the Linux Kernel
- Articles at locklessinc
- Articles at 1024 cores
- Blog at 1024 cores
- Memory Ordering in Modern Microprocessors, Part I
- Memory Ordering in Modern Microprocessors, Part II
- What is RCU
- urcu
- My first RCU design
- Modern micro processors
- cbloom rants
- Flat combining
Libraries:
- libsync
- atomic_ops
- gcc atomic builtins
- Threading Building Blocks
- Atomic ptr plus project
- lib lock free data structure
- a scalable allocator : OSlash
- cpp framework
- lock free
- boost began impl of c++0x
- google perf tools
- boost impl of c++0x
- review of available atomic libs
- Concurrency Kit
Wednesday, 21 March 2012
Zoe Lise Ewann part 2
Thumbnails
Large photos
Zoe-Lise-Ewann part 1/2
Thumbnails
Large photos
Belated Christmas photos
thumbnails
large format
Belated Christmas photos
thumbnails
large format
Belated photos from the little ones' birthday
thumbnails
photos
