Sessions can be attacked. That’s a fact. And there are many ways to attack them. Three of the most common are “session fixation”, “session hijacking” and “session flooding”. In simple words, session fixation is about tricking someone into using a session ID that does not belong to them. Session hijacking is about stealing a session ID that belongs to someone else and using it for our own purposes. Finally, session flooding involves automated clients (bots) that try to initiate a vast number of fake/empty sessions, with the ultimate goal of making the server run out of resources and, consequently, either taking it out of service or considerably degrading its performance.
Measures against session fixation attacks:
– session IDs should not be passed in URLs, i.e. transparent session ID support should be off ( session.use_trans_sid = 0 )
– uninitialized session IDs should not be accepted ( session.use_strict_mode = On )
– session ID should only be stored in cookies ( session.use_only_cookies = 1 )
* the relevant PHP configuration parameter is shown in parentheses
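The three settings above can also be applied at runtime instead of in php.ini. This is a minimal sketch, assuming PHP 7.0 or later (where session_start() accepts an options array); the helper name fixationSafeOptions() is hypothetical.

```php
<?php
// Hypothetical helper: the three fixation-related settings listed above,
// written without the "session." prefix, as session_start() expects.
function fixationSafeOptions(): array
{
    return [
        'use_trans_sid'    => 0, // never pass the session ID in URLs
        'use_strict_mode'  => 1, // reject session IDs the server did not issue
        'use_only_cookies' => 1, // accept the session ID from cookies only
    ];
}

// Usage (in a web request, before any output is sent):
// session_start(fixationSafeOptions());
```

Setting the same values in php.ini has the same effect and covers every script on the server.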
Session hijacking: Although my intention here is not to go into details on how these attacks can be implemented, I will briefly mention some of the ways that session hijacking (the main attack vector to sessions, nowadays) can be achieved:
(b) session sniffing: e.g. by monitoring unencrypted network traffic (usually Wi-Fi).
(c) session prediction: trying to predict the session ID that a user may have.
(d) man-in-the-middle attack: someone who manages to intercept communication between the browser and the server is able to manipulate the traffic, making the server and client talk to him while each believes it is talking to the other.
Though knowing how a session attack takes place is essential in order to understand how these attacks can be mitigated, we, the developers, are usually most interested in the counter-measures and techniques that we can apply to defend ourselves. So, here is a list of such measures that can be used to ensure the security of our sessions:
– make session IDs as unpredictable as possible. It is advised to create IDs by using a mechanism provided by a library or framework and not to build our own mechanisms.
– use secure communication (HTTPS)
– use the HttpOnly flag for session cookies (this will help keep your session cookie out of reach of cross-site scripting)
– change (regenerate) the session ID after any upgrade in privileges (e.g after login)
Note: the session ID regeneration should happen before we add sensitive information (e.g auth credentials, user info) to the session
– regenerate session ID periodically, especially for security sensitive content
– store the remote IP together with the session and compare it against the IP of successive pages/requests
– use HSTS (the HTTP Strict-Transport-Security response header) to prevent session hijacking in case of sidejacking or a Man-In-The-Middle scenario.
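Several of the measures above can be sketched in a few lines. This is only an illustration, assuming PHP 7.3 or later (for the array form of session_set_cookie_params()); the function names, the session keys 'user_id', 'ip' and 'regenerated_at', the 15-minute interval and the HSTS max-age value are all hypothetical choices, not recommendations from the text.

```php
<?php
// Cookie flags: send the session cookie over HTTPS only and hide it from JavaScript.
function secureCookieParams(): array
{
    return ['secure' => true, 'httponly' => true];
}

// Regenerate the ID first, then store sensitive data under the new ID.
function logIn(int $userId, string $remoteIp): void
{
    session_regenerate_id();              // the old session data is not deleted here
    $_SESSION['user_id'] = $userId;
    $_SESSION['ip'] = $remoteIp;
    $_SESSION['regenerated_at'] = time();
}

// True when the ID is older than $maxAge seconds and should be regenerated.
function needsRegeneration(array $session, int $now, int $maxAge = 900): bool
{
    return ($now - ($session['regenerated_at'] ?? 0)) > $maxAge;
}

// True when the request comes from the IP recorded at login.
function sameClient(array $session, string $remoteIp): bool
{
    return isset($session['ip']) && hash_equals($session['ip'], $remoteIp);
}

// Usage in a web request:
// session_set_cookie_params(secureCookieParams());
// session_start();
// header('Strict-Transport-Security: max-age=31536000'); // HSTS, example value
```

The IP check will produce false alarms for users whose address changes during a session (e.g. on mobile networks), so treat a mismatch as a signal rather than as proof of an attack.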
Session flooding attack: The simplest scenario is having an HTTP client sending requests to our web application without including the session cookie in any of them. Our application will try to start a new session for each of these requests which not only takes up more and more storage space but also consumes CPU and memory resources for this process.
Not much information can be found out there on how to mitigate this threat. Maybe because this kind of attack is sometimes handled in the network layer by routers or firewalls. In any case, one idea is to keep track of some information about the client for which we are setting up a session (e.g. IP, User-Agent, …) and limit the number of sessions that can be created per client. Be a little flexible with that limit, since it is not easy to uniquely identify each browser, especially when multiple users are behind NAT.
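The idea of limiting session creation per client could look roughly like this. It is a sketch under stated assumptions: PHP 7.4 or later, a hypothetical SessionCreationLimiter class, and an in-memory array where a real application would need storage shared between requests.

```php
<?php
// Hypothetical limiter: at most $maxSessions new sessions per client
// within a sliding window of $windowSeconds.
final class SessionCreationLimiter
{
    /** @var array<string, int[]> creation timestamps per client key */
    private array $created = [];
    private int $maxSessions;
    private int $windowSeconds;

    public function __construct(int $maxSessions, int $windowSeconds)
    {
        $this->maxSessions = $maxSessions;
        $this->windowSeconds = $windowSeconds;
    }

    // Clients are identified only approximately (IP + User-Agent).
    public static function clientKey(string $ip, string $userAgent): string
    {
        return hash('sha256', $ip . '|' . $userAgent);
    }

    // Returns true and records the creation if the client is under its limit.
    public function allow(string $clientKey, int $now): bool
    {
        $cutoff = $now - $this->windowSeconds;
        $recent = array_values(array_filter(
            $this->created[$clientKey] ?? [],
            fn(int $t): bool => $t > $cutoff
        ));
        $allowed = count($recent) < $this->maxSessions;
        if ($allowed) {
            $recent[] = $now;
        }
        $this->created[$clientKey] = $recent;
        return $allowed;
    }
}
```

A generous limit is the safer default here, because many legitimate users behind the same NAT share one IP and often one User-Agent string.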
The “regeneration” issue
Regenerating the session ID is not… that simple. Or… it shouldn’t be, ideally. And that’s because some things may go wrong in the whole process. Let’s distinguish two specific cases. On an unstable network, the server will try to set the new session ID via a cookie, but the response carrying the Set-Cookie header may never reach the client due to a network glitch (e.g. think of a mobile phone with a bad signal). So, the session cookie will not be modified in the browser and, in the next request, the client will try to use the old session ID. A similar situation may occur with concurrent access to a web site. As you may know, browsers often use multiple connections to the server in order to save time by making concurrent requests through these connections. A new session ID may be issued on one connection (by the session_regenerate_id() function), but by the time the new session cookie arrives at the browser, another request may have already been sent on a second connection. This request carries the old session cookie, not the new one. So, again, the second request will arrive after the regeneration and will try to access the old session. So, as proposed by the PHP documentation: “You must not prohibit access to old session data immediately after session_regenerate_id(). It must be prohibited a little later. e.g. A few seconds later for stable wired network. A few minutes later for unstable network such as mobile or WiFi.” There is one more reason for this: potentially malicious access cannot be detected if the active session is removed immediately. The conclusion is that we should never call session_regenerate_id(true) or session_destroy() for an active session.
On the other hand, we should not leave the deletion of old sessions to chance. As, again, proposed by the PHP documentation, we should not rely on session ID expiration by session.gc_maxlifetime. Attackers may access the victim’s session ID periodically to prevent expiration and keep exploiting it, including authenticated sessions. Instead, we must implement time-stamp based session data management ourselves. When the short-term expiration time (time-stamp) set for the old session after regeneration has passed, the session is considered obsolete. If a user accesses an obsolete (expired) session, deny access to it. Access to obsolete session data could be an attack (or maybe not, for the reasons explained in the previous paragraph). To tell the difference, you must keep track of the active sessions per user. If it is an attack, or if we cannot track active sessions per user, it is recommended to remove the authenticated status from all of the user’s sessions, because it is likely an attack.
If we can keep track of active sessions per user, we can also notify the user of how many active sessions exist for them, from which IP (and area), how long each has been active, etc. There are a number of ways to implement this. You may set up a database that keeps track of the required data. Since session data are subject to garbage collection, you have to be careful to keep the active session database consistent. One of the simplest implementations is a “User ID prefixed session ID”, with the required information stored in $_SESSION. Many databases perform well when selecting by string prefix. You can use session_regenerate_id() and session_create_id() for this.
Be careful: Never use confidential data as a prefix. If the user ID is confidential, consider using hash_hmac().
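The time-stamp check described above can be sketched with plain arrays standing in for $_SESSION. This is an illustration only: the function names and the session keys 'destroyed_at' and 'user_id' are hypothetical, and the grace period is whatever suits your network conditions.

```php
<?php
// Mark the old session as obsolete-from-now instead of deleting it.
function markObsolete(array $session, int $now): array
{
    $session['destroyed_at'] = $now;
    return $session;
}

// Classify a session that was just read from storage:
//   'active'   - never replaced, use it normally
//   'grace'    - old ID seen shortly after regeneration (lost cookie, parallel request)
//   'obsolete' - old ID seen after the grace period, deny access
function sessionState(array $session, int $now, int $graceSeconds): string
{
    if (!isset($session['destroyed_at'])) {
        return 'active';
    }
    return ($now - $session['destroyed_at'] <= $graceSeconds) ? 'grace' : 'obsolete';
}

// Remove the authenticated status when an obsolete session is accessed.
function dropAuthentication(array $session): array
{
    unset($session['user_id']);
    return $session;
}
```

In a real request the same functions would be applied to $_SESSION right after session_start(): the time-stamp is written to the old session just before switching to the new ID, and it must not be carried over into the new session.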
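A prefix derived with hash_hmac() might look like the following. This is a sketch: the helper name sessionPrefixFor(), the 16-character truncation and the way the secret is supplied are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: derive a non-reversible prefix from the user ID.
// $secret is an application secret, assumed to be kept outside the code base.
function sessionPrefixFor(string $userId, string $secret): string
{
    // Hex output uses only 0-9 and a-f, and "-" is also allowed in session IDs.
    return substr(hash_hmac('sha256', $userId, $secret), 0, 16) . '-';
}

// Usage with an active session (PHP 7.1+):
// $newId = session_create_id(sessionPrefixFor('42', $secret));
```

All sessions of one user then share the same prefix, so they can be found with a prefix query without exposing the user ID itself.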
Some very nice code samples that illustrate the workings of timestamp-based sessions can be found in the official documentation: http://php.net/manual/en/function.session-regenerate-id.php In any case, these are some things to take into account:
Performance Considerations and Scaling issues
Locking: Every time a session starts, the session file is locked by the operating system in order to prevent data corruption. This is the behavior of the file session handler, but not only of that one. If you are storing sessions in Memcached using PHP’s Memcached extension, you will notice a similar “locking” behavior. Of course, in this case, Memcached allows us to disable this locking mechanism through configuration. Using a relational database as storage means that we have to provide a custom implementation of a session handler, and there is no locking by default. We will have to programmatically set the class or methods responsible for reading/writing our session data by using the session_set_save_handler() function. So, any locking mechanism is left to our implementation.
Is locking a problem? Generally speaking, it depends on whether your application may send concurrent requests from the same browser. Sending more than one AJAX request in parallel should not be considered unusual. Due to the locking mechanism, they would have to be executed one after the other. So, it’s all about trade-offs.
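To show what a custom handler without locking involves, here is a minimal sketch assuming PHP 8.0 or later. The class name ArraySessionHandler is hypothetical, and the array only stands in for a database table: it lives for a single request, so a real handler would run queries in read() and write() instead.

```php
<?php
// Sketch of a custom session handler with no locking of any kind.
final class ArraySessionHandler implements SessionHandlerInterface
{
    /** @var array<string, array{data: string, time: int}> */
    private array $rows = [];

    public function open(string $path, string $name): bool
    {
        return true; // nothing to connect to in this sketch
    }

    public function close(): bool
    {
        return true;
    }

    public function read(string $id): string|false
    {
        // An unknown ID must yield an empty string, not an error.
        return $this->rows[$id]['data'] ?? '';
    }

    public function write(string $id, string $data): bool
    {
        $this->rows[$id] = ['data' => $data, 'time' => time()];
        return true;
    }

    public function destroy(string $id): bool
    {
        unset($this->rows[$id]);
        return true;
    }

    public function gc(int $max_lifetime): int|false
    {
        $before = count($this->rows);
        $cutoff = time() - $max_lifetime;
        $this->rows = array_filter($this->rows, fn(array $r): bool => $r['time'] >= $cutoff);
        return $before - count($this->rows); // number of sessions removed
    }
}

// Usage, before session_start():
// session_set_save_handler(new ArraySessionHandler(), true);
```

Because nothing here blocks a second request from reading while the first one is still working, two parallel requests can overwrite each other's changes; adding a lock in read() and releasing it in close() is what the built-in file handler effectively does for us.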
ALL concurrent requests will ALWAYS be subject to race-conditions
UNLESS you do file locking, at which point they are not concurrent requests anymore.
Even if you don’t currently have concurrent requests, it is good to try to decrease the possibility of such blocking situations. These are some ways to achieve this:
– use read-only sessions when session data update is not required (PHP 7)
– close sessions as soon as you have finished updating session data (PHP 5 + 7)
session_commit() or session_write_close()
– make cookie-less requests (or don’t start a session), if you don’t need the session at all
Always be careful to check whether your code tries to write session data after the point where you end a session with any of the above functions. Sometimes, this writing is not obvious (e.g. writing session data as part of a captcha mechanism or CSRF protection).
Our last resort can be to deactivate the locking mechanism, if possible. That presupposes that we understand the implications: we understand what race conditions are and why no harm can (possibly) be done in our case. But keep this possibility only for cases where locking causes A LOT OF TROUBLE. Disabling locking means that we either have to use a session storage that supports disabling it (e.g. Memcached) or override the default mechanism with a custom implementation. An example of such an override for PHP’s default file handler can be found in the PHP documentation: http://php.net/manual/en/class.sessionhandlerinterface.php
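The first two techniques from the list above can be wrapped in small helpers. This is a sketch assuming PHP 7.0 or later; the function names readSessionSnapshot() and updateSessionAndRelease() are hypothetical.

```php
<?php
// Read-only access: the session is opened, $_SESSION is populated and the
// session is closed again straight away, so the lock is held only briefly.
function readSessionSnapshot(): array
{
    session_start(['read_and_close' => true]);
    return $_SESSION;
}

// Write access: hold the lock only while the changes are applied.
function updateSessionAndRelease(array $changes): void
{
    session_start();
    foreach ($changes as $key => $value) {
        $_SESSION[$key] = $value;
    }
    session_write_close(); // anything written to $_SESSION after this is not saved
}
```

Slow work such as calling external services then runs after the lock has been released, so parallel requests from the same browser no longer queue up behind it.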
A nice idea that can reduce racing issues in case we decide to deactivate locking, is “Auto-merging” and comes from Oscar Merida. You can read it here: https://www.phparch.com/2018/01/php-sessions-in-depth/
Garbage Collection (GC): There is no single best configuration for the frequency of garbage collection. It depends on the application traffic. What we need to keep in mind is that, for systems with many active users, garbage collection is a resource-expensive operation, since we have to iterate over many session files. Running GC too often may have a significant impact on the application’s performance. Running it very infrequently can lead to a very high number of session files.
So, the best configuration can be the result of a trial-and-error procedure.
Pay special attention to the fact that not all PHP installations are the same. Some operating systems can be a bit tricky. For example, Ubuntu sets session.gc_probability to zero in order to deactivate random GC and delegates the session file clean-up to a cron job (see an example of how this unexpected behavior can lead to problems: https://nystudio107.com/blog/the-case-of-the-missing-php-session ).
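How often random GC runs is governed by two standard PHP settings, session.gc_probability and session.gc_divisor: the chance of a GC run on a given session start is the first divided by the second. A small sketch of that arithmetic, with a hypothetical helper name gcChance():

```php
<?php
// Chance that a single session_start() call triggers garbage collection.
function gcChance(int $probability, int $divisor): float
{
    if ($divisor <= 0) {
        throw new InvalidArgumentException('divisor must be positive');
    }
    return $probability / $divisor;
}

// Usage with the values of the running installation:
// $chance = gcChance(
//     (int) ini_get('session.gc_probability'),
//     (int) ini_get('session.gc_divisor')
// );
// A chance of 0 means random GC never runs; a scheduled script can then call
// session_gc() (PHP 7.1+) after session_start() to clean up instead.
```

Checking this value on every environment you deploy to is a cheap way to find out whether clean-up is left to PHP or to something outside it.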
Of course, in high traffic systems, sessions are almost never kept in the filesystem. For distributed applications, saving sessions to filesystem means using a distributed file system or a centralized one that will be used by all application servers. This is a heavy maintenance burden, especially, when better solutions exist, like using a database (usually, a key-value one).
Serialization: Session data are held in $_SESSION as an associative array. Before being stored (no matter whether the session storage is the filesystem or a database), they need to be serialized (converted into a string representation). PHP uses a specialized serialization method which is different from the normal serialize() function. If you are curious, you can get this string representation of your session data using the session_encode() function. You can also change the serialization method using the session.serialize_handler configuration parameter. Is it worth replacing the serialization method with a third-party library? Maybe, but the main reason is not the serialization speed or the length of the string produced. You may need to do that if you share sessions with non-PHP systems and need them to be able to decode this string.
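To make the difference from serialize() concrete, the function below imitates the layout produced by the default "php" serialize handler, where each top-level key is written as name|serialized-value. It is a hand-written illustration with a hypothetical name, not what PHP calls internally, and it ignores edge cases such as keys containing "|".

```php
<?php
// Imitation of the default session format, for comparison with serialize().
function encodeLikePhpHandler(array $data): string
{
    $out = '';
    foreach ($data as $key => $value) {
        $out .= $key . '|' . serialize($value);
    }
    return $out;
}

$data = ['user' => 'bob', 'n' => 1];

// Session-style layout: user|s:3:"bob";n|i:1;
$sessionStyle = encodeLikePhpHandler($data);

// Plain serialize():    a:2:{s:4:"user";s:3:"bob";s:1:"n";i:1;}
$plain = serialize($data);
```

A non-PHP system that has to read the session therefore needs to understand both the outer name|value layout and PHP's value encoding, which is why a shared format can be worth the switch.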
Serialization is the reason why we should avoid storing objects in session. If you do, here are some things to take into account:
– special attention should be paid to the implementation of the __sleep and __wakeup functions that are called during the (un)serialization process. What is more, you cannot control the (un)serialization of an object stored in session by implementing the Serializable interface. As I have already mentioned, the serialization method used for sessions is a bit different from the one used by the serialize() function.
– static properties are not serialized.
– the object’s class should be available anywhere in the application where the session is used. Which means that by the time the session_start() call is made, the class definition should have already been encountered by PHP or, alternatively, it can be found by an already-registered autoloader. This can be an issue if you are sharing sessions with other non-PHP systems.
– if you update a class and instances of this class are stored in the session storage, then the new class version should be backward compatible to the old one. Otherwise, the unserialization process that loads the session data will fail.
So, my advice would be, try to avoid storing objects in sessions. If you have to, then it is wiser if you manually serialize them before storing them in the session.
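One way to serialize manually is to store a plain array and rebuild the object on each request. A minimal sketch, assuming PHP 7.4 or later; the Cart class, its methods and the 'version' field are hypothetical.

```php
<?php
// Hypothetical value object that is kept in the session as a plain array.
final class Cart
{
    /** @var array<string, int> quantity per product code */
    private array $items;

    public function __construct(array $items = [])
    {
        $this->items = $items;
    }

    public function add(string $sku, int $quantity): void
    {
        $this->items[$sku] = ($this->items[$sku] ?? 0) + $quantity;
    }

    // Only scalars and arrays leave the object, so loading the session
    // never requires the Cart class to be defined.
    public function toArray(): array
    {
        return ['version' => 1, 'items' => $this->items];
    }

    // Missing fields fall back to defaults, which keeps old data readable.
    public static function fromArray(array $data): self
    {
        return new self($data['items'] ?? []);
    }
}

// Usage with an active session:
// $_SESSION['cart'] = $cart->toArray();
// $cart = Cart::fromArray($_SESSION['cart'] ?? []);
```

The explicit version number is there so that a later release of the class can recognise and convert data written by an earlier one.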