Entropy, Information, Cloud & On-Prem, and farming (They look unrelated to each other, don't they? :)
Same thing from different angles, different field motivations, different associations and different thoughts that come to mind...
In today's blog post, I will try to share some interesting inferences that I made while following some of Sean Carroll's (my favorite theoretical physicist and philosopher) online tutorials about entropy.. This is what I try to do, right? Getting input from different fields and using it in different contexts. Since 2013, I have tried to generate unique content, and I hope this blog post will be on a similar wavelength.
This blog post is about an analogy built on the things that I picked up from my physics research. (I'm a physics fan -- for those who don't know it :) An analogy that concentrates on our industry. Actually, my main purpose in this post was to draw attention to how different subfields approach information and entropy. But I can't help thinking that the inferences I made along the way may also apply to our industry (technology, IT...) as well.. In short, what I want to highlight is this: different approaches to the same notion (coming from different disciplines) may lead us to a mismatch when we associate two subjects with each other..
At the end of this post, I will also try to make an analogy to farming, to beekeeping to be exact :) This agricultural subject came up in our conversations with my friend and brother Barış Saltık, who is a highly respected senior computer engineer..
Okay, let's see what I mean by that.. Let's get into physics and communication theory quickly...
Everyone agrees; there's a close relationship between entropy and information.
However; different people from different subfields may think the relationship goes in opposite directions..
For a physicist, high entropy is associated with low information. That is to say -> if the entropy is high, we know almost nothing about that thing (or that system). In a low entropy configuration, however, we say that we know a lot about that system.
This means physicists tend to associate high entropy with low information content. (Note that entropy is a property of a distribution on phase space, and information is about where you are in phase space -- remember Boltzmann entropy.)
But! A communication theorist has a completely different notion. Actually, it is all mathematically and formally the same, but for a communication theorist (like Claude Shannon), low entropy corresponds to low information & high entropy corresponds to high information.
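To make the "formally the same, but read in opposite ways" point a bit more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the 8-state system and the names `shannon_entropy`, `uniform` and `peaked` are hypothetical and do not come from Sean Carroll's material. It computes the Shannon entropy of a spread-out distribution and of a fully peaked one.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits: H = sum(p * log2(1 / p)), skipping p == 0."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)


# Hypothetical system with 8 possible states
# (8 possible messages for Shannon, 8 microstates for a physicist).
uniform = [1 / 8] * 8        # every state equally likely
peaked = [1.0] + [0.0] * 7   # one state is certain

h_uniform = shannon_entropy(uniform)
h_peaked = shannon_entropy(peaked)

print(f"uniform: {h_uniform:.1f} bits")  # uniform: 3.0 bits
print(f"peaked:  {h_peaked:.1f} bits")   # peaked:  0.0 bits
```

The number is the same for both camps; only the reading differs. For the physicist, 3 bits is what we are still missing about the exact state of the system (high entropy, we know little). For the communication theorist, 3 bits is what each received message delivers (high entropy, a lot of information per message). With the peaked distribution, the physicist already knows the state, and the communication theorist learns nothing new from the message.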
See my posts on entropy for getting some background knowledge on this.. (background knowledge for our context) ->
https://ermanarslan.blogspot.com/2020/05/entropy-linux-kernel-csprngs-devurandom.html
https://ermanarslan.blogspot.com/2022/04/ebs-poor-performance-in-autoconfig-due.html
Well.. As you see, different people from different subfields think the relationship (between information & entropy) goes in opposite directions.
Thanks to Sean Carroll, we are aware of these interesting things :) -- I got all the information about what I mentioned in the paragraph above from Sean Carroll's publications.
As Sean Carroll mentions; they are both perfectly sensible ways of thinking but they are different ways of thinking depending on what you're thinking about..
Well.. I have one thing to add, actually.. I know that physicists and communication theorists don't have the same reference points.. But still, yes! For example, when someone says "high entropy" and we think of the first thing that comes to mind, we see that difference...
Let's check our world, I mean the IT world.. Databases, systems, middleware, management, costs, efforts, stuff like that.. What about Cloud & On-premise? Here we are, finding a similar difference (or let's say, trying to find a similar one :).. Remember, analogies are always dangerous, but they are also handy :)
Anyways; think about an on-prem environment. (Cloud-ready, cloud-like modern on-prem environments are exceptions to this.) If we own the data center, operate in a classical way, and manage all of the components ourselves, we almost always prefer to build (& have) an uncomplicated, plain system architecture. Because we think it is easy to manage such an architecture, and it is easy to do diagnostics there. Moreover, it is not easy to acquire the know-how needed to implement complex systems.. For instance, we have a database and we put all the files (all the files that we read during the day) in there as well :) We could position a Hadoop cluster for that task, right? Or an object storage or a cluster filesystem, and do the integration, but that requires extra know-how, effort, and cost (at least for management & ownership).
So a complex architecture looks inefficient and not preferable to someone with an on-prem point of view (again, there are exceptions -> cloud-ready environments, public cloud-like on-prem environments). However, for a cloud architect, or for someone looking from the perspective of the cloud with a cloud-ready or cloud-native mindset, a complex architecture may be preferable.
It is similar, right? In this case, classical on-prem folks may associate that complexity with an architecture that is not preferred, while cloud folks, on the other hand, may associate complexity with a preferred architecture..
Let's focus a little more on this to help you understand what I mean. I don't want you to think I am making all these things up :)
In modern Cloud environments (like Oracle Cloud Infrastructure or Google Cloud Platform), we have the capability to use all the components we want. This really applies to all the components: software components, virtual machines, storage servers, load balancers, firewalls, Docker containers, Hadoop clusters, managed ETL/ELT processes etc.. you name it.. We consume the capabilities of these components, and we do it in various ways. I mean, we can have dedicated resources, or we can have as-a-service solutions. Most of the time, all of these components are managed by the cloud vendors, so we don't need to care about the know-how required to keep these components up and running (and up-to-date). Support, diagnostics, patching, building DR environments (and lots of other things)... We don't need to think about these things at all, because they are just one click away -- of course, once you plan your cloud architecture and deploy your cloud resources accordingly.
As you may already know, having these amenities and possibilities saves us time, and we use that time for improving the technical things that affect the business side of things (talking about a good effect, of course).
Well.. Those technical things may be: having a faster deployment process, or having a distributed infrastructure that serves users connecting from different continents with the resources in the relevant continent, or a truly scalable environment, right? An auto-scale environment that can scale in and scale out automatically, for instance.. -- Just think about implementing one of these in an on-prem environment.. Complex and hard, right? However, in Cloud environments, it is easy (of course, if you know what you are doing).
Of course, there are the other things that I mentioned before.. Loading the files into the database all the time, really? What about managing them? What about doing fast file I/O on a system that is built for database I/O? What about the data life cycle? What about using the expensive database resources for storing files you never read? What about building Hadoop clusters? Kubernetes environments? Microservices? Sounds cool but hard, right? What about the monitoring of all this stuff? The alert mechanisms? The high-end security? The data flow? ...
I think you get the idea.. So these are not a problem in cloud environments and that's why they are preferred there.
Actually, it is a fact that a complex architecture, built by putting all this (modern) stuff to work and accompanied by proper orchestration (I'm talking about something close to being fully managed), is preferred (universally :)! However, the difficulty of implementing and maintaining it in on-prem environments prevents such complex architectures from being preferred there.
But! as I just mentioned; the situation is different for the cloud. The benefits of complex architectures in Cloud environments can find space to show themselves there.. Complex environments where we position specialized components for exactly what is relevant, are naturally preferred because in the cloud; implementing and maintaining such complex architectures does not bring as much difficulty as on-prem.
Of course, even in the cloud, complex architectures may bring some difficulties, but these difficulties don't affect the decision that much, because the benefits outweigh them...
At the end, we say almost the same thing, right? In the beginning, we said -> "Same thing from different angles, different field motivations, different associations and different thoughts that come to mind..." and at the end, we say -> "Same thing from different environments full of different realities, (again) different motivations and (again) different associations" :)
Just a note before I finish;
Being cloud-ready eases cloud migrations.. I mean, in cloud migration projects, we generally migrate the environment as-is.. After the migration, there are phases to make the environment cloud-ready (and maybe cloud-native).. These things are mostly related to the SW stack, but there are things that need to be done in the infra layer as well, and all of them can be deduced from what I wrote above :) Anyways; if it is not a modernized architecture, it will not get the full benefit of being on the cloud, because its cloud footprint won't be in that complex (and modernized) mode on day 1..
Maybe you already noticed, maybe you didn't.. What I want to point out is this: this text seems impartial, but in fact, it tells you to be cloud-ready and to modernize your whole stack. It whispers to you: "turn your concentration this way and start getting the benefits".. Sooner or later it will be your choice, anyways..
Before I finish, I want to make one last analogy :) It is in the context of farming, agriculture, and more precisely beekeeping.
While bees collect nectar and pollen, they always go to the same flowers at certain times. They have to develop a strategy according to the flower type in their focus. That is, they calibrate their tongue for a particular flower type. So it is difficult for them to collect pollen and nectar from other types of flowers in that process. On the other hand, another hive focuses on another flower type, and for that hive, the flower that the first hive's bees focus on seems difficult and impractical, until that hive has to change its flower type. In fact, the "correct" (easier) flower type is determined by the habits of the bees.
However; the future of the hive is not about focusing on easy flowers in the short term. It is about being ready to adapt according to the amount of nectar. Of course, the speed of that adaptation is also important. Good hives are those which can adapt quickly. + At the end of the day, you produce new hives from successful hives, not from the other ones :)