Cloud computing provides very good and easy to use feature to an organization, but at the same time it brings lots of question that how secure is the data, which has to be transported from one place to another in cloud. So, to make sure it remains secure when it moves from point A to point B in cloud, check that there is no data leak with the encryption key implemented with the data you sending.
The security laws which are implements to secure data in the cloud are as follows:
Input validation: controls the input data which is being to any system.
Processing: control that the data is being processed correctly and completely in an application.
File: control the data being manipulated in any type of file.
Output reconciliation: control the data that has to be reconciled from input to output.
Backup and recovery: control the security breaches logs and the problems which has occurred while creating the back.
Cloud computing is going all together for a different look as it now includes different data types like emails, contracts, images, blogs, etc. The amount of data increasing day by day and cloud computing is requiring new and efficient data types to store them. For example if you want to save video then you need a data type to save that. Latency requirements are increasing as the demand is increasing. Companies are going for lower latency for many applications.
To optimize the cost and other resources there is a concept of three-data-center which provides backups in cases of disaster recovery and allows you to keep all the data intact in the case of any failure within the system. System management can be done more efficiently by carrying out pre-emptive tasks on the services and the processes which are running for the job. Security can be more advanced to allow only the limited users to access the services.
Cloud data center doesn't require experts to operate it, but it requires skilled people to see the maintenance, maintain the workloads and to keep the track of the traffic. The labor cost is 6% of the total cost to operate the cloud data center. Power distribution and cooling of the datacenter cost 20% of the total cost. Computing cost is at the end and is the highest as it is where lots of resources and installation has to be done. It costs the maximum left percentage.
A user should know some parameters by which he can go for the cloud computing services. The parameters are as follows:
1. User should know the data integrity in cloud computing. It is a measure to ensure integrity like the data is accurate, complete and reasonable.
2. Compliance: user should make sure that proper rules and regulations are followed while implementing the structure.
3. Loss of data: user should know about the provisions that are provided in case of loss of data so that backup and recovery can be possible.
4. Business continuity plans: user should think about does the cloud services provide him uninterrupted data resources.
5. Uptime: user should know about the uptime the cloud computing platform provides and how helpful it is for the business.
6. Data storage costs: user should find out about the cost which you have to pay before you go for cloud computing.
Cloud computing platform has various databases that are in support. The open source databases that are developed to support it is as follows:
1. MongoDB: is an open source database system which is schema free and document oriented database. It is written in C++ and provides tables and high storage space.
2. CouchDB: is an open source database system based on Apache server and used to store the data efficiently
3. LucidDB: is the database made in Java/C++ for data warehousing. It provides features and functionalities to maintain data warehouse.
Cloud computing has many providers and it is supported on the large scale. The providers with their databases are as follows:
• Google bigtable: it is a hybrid cloud that consists of a big table that is spilt into tables and rows. MapReduce is used for modifying and generating the data.
• Amazon SimpleDB: is a webservice that is used for indexing and querying the data. It allows the storing, processing and creating query on the data set within the cloud platform. It has a system that automatically indexes the data.
• Cloud based SQL: is introduced by Microsoft and it is based on SQL database. it provides data storage by the usage of relational model in the cloud. The data can be accessed from the cloud using the client application.
There are many platforms available for cloud computing but to model the large scale distributed computing the platforms are as follows:
1. MapReduce: is software that is being built by Google to support distributed computing. It is a framework that works on large set of data. It utilizes the cloud resources and distributes the data to several other computers known as clusters. It has the capability to deal with both structured and non-structured data.
2. Apache Hadoop: is an open source distributed computing platform. It is being written in Java. It creates a pool of computer each with hadoop file system. It then clusters the data elements and applies the hash algorithms that are similar. Then it creates copy of the files that already exist.