Are there cloud / cluster / hosting providers that charge for actual CPU time used?

Are there any providers that will only charge for 2 hours a day of computation? As far as I can tell from reading the various literature, Azure, EC2, and GAE will all charge for as long as the code is deployed to an instance, whether it is actually doing anything or not. I suppose this would work if we could automatically switch on an instance on a scheduled basis, and allow it to terminate itself once it's complete... But I can't find anything that allows this.

Background

We have a procedure that looks like this:

Every day at 6 am, download some data from a particular website
Perform a computation on that data. This computation lasts less than 2 hours
Send (HTTP) the results of that computation to another website
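For concreteness, the three steps above could be sketched roughly as follows. This is only an illustration: the function names, the URLs and the use of a plain HTTP POST are assumptions, and the real computation is passed in as `compute`.

```python
import urllib.request
from typing import Callable


def fetch(url: str) -> bytes:
    """Step 1: download the raw input data over HTTP."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def send(url: str, payload: bytes) -> int:
    """Step 3: POST the results to the receiving site, return the HTTP status."""
    request = urllib.request.Request(url, data=payload, method="POST")
    with urllib.request.urlopen(request) as response:
        return response.status


def run_daily_job(
    source_url: str,
    target_url: str,
    compute: Callable[[bytes], bytes],
    fetch_fn: Callable[[str], bytes] = fetch,
    send_fn: Callable[[str, bytes], int] = send,
) -> int:
    """Run the three steps in order: download, compute, send."""
    data = fetch_fn(source_url)
    result = compute(data)  # step 2: the long-running part
    return send_fn(target_url, result)
```

The `fetch_fn` and `send_fn` parameters exist only so the flow can be tried without touching the network.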

We're looking to run this in such a way that it does not require any manual intervention. So every day, we would need at most 2 hours of CPU time. We'd love to host this somewhere that maximizes the efficiency of being idle for 22 hours out of each day (and charges accordingly).

Is there anyone that offers such a service?

Windows Azure exposes the management of applications (services) through a REST-based API (http://msdn.microsoft.com/en-us/library/ee460799.aspx). You could write your own code against this REST API to manage your deployments. There are both commercial (Cerebrata Azure Management Cmdlets) and free (Windows Azure Platform PowerShell Cmdlets) tools available that can help you automate your deployment tasks.

App Engine has the Cron Service which can issue a request at a particular time of day. You could then use Task Queues to actually perform the work.

The only drawback is that tasks must complete within 10 minutes, which means you'd have to break up your processing into discrete chunks that can each finish in less than 10 minutes.
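A rough sketch of that chunking, in plain Python rather than the App Engine API: work out how many items fit in one task with some safety margin, then split the input accordingly. The helper names, the per-item timing and the 120-second margin are assumptions for illustration; each chunk would then be enqueued as its own task.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def items_per_task(seconds_per_item: int,
                   limit_seconds: int = 600,
                   margin_seconds: int = 120) -> int:
    """How many items fit in one task while staying under the time limit.

    limit_seconds is the 10-minute task deadline; margin_seconds is an
    assumed safety buffer, not something App Engine prescribes.
    """
    if seconds_per_item <= 0:
        raise ValueError("seconds_per_item must be positive")
    return max(1, (limit_seconds - margin_seconds) // seconds_per_item)


def split_into_chunks(items: Sequence[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(items), chunk_size):
        yield list(items[start:start + chunk_size])
```

This only works if the computation can be split into independent pieces; if each step depends on the previous one, the intermediate state would have to be stored between tasks.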

Unlike Amazon et al., App Engine doesn't charge based on time; it charges based on actual usage (e.g. the number of queries per second you're processing, the number of bytes used, etc.).

I would say that Google App Engine's Backends sound like what you need:

Dynamic backends come into existence when they receive a request, and are turned down when idle; they are ideal for work that is intermittent or driven by user activity. For more information about the differences between resident and dynamic backends, see Types of Backends and also the discussion of Startup and Shutdown.

Backends do not automatically scale in response to request volume. Instead, you specify the number of instances of each backend, and change this number by performing an update or configure command. The number of instances is usually set in proportion to the size of a dataset or the degree of processing power you want to bring to bear on a problem. Cost may also be a consideration.

In your case, you could just bring up a backend at 6am, run a long-running computation (no 10-minute time limit à la Task Queues), and use the URLFetch library to send the results wherever you need them. The downside here is that (IIRC) if you aren't under the free quota, you'll be paying the $9 / month for the app (although you could use that to get a nice and beefy Backend).

Alternatively, you could just use Amazon EC2 and customize an image with all the code you need, store that in S3, and set a cron job to fire up an instance of it at 6am, run the computation, and kill it once it's done and you have the necessary results. Here, you'd only be charged for the two hours that it runs and a few cents extra to store the image in S3.

Amazon EC2 can do what you are looking for.

Amazon's Auto Scaling supports scheduling the number of instances you want running at any particular point in time. You can use this to have a single instance start at the time you want. You would configure the AMI/run parameters so that it runs your batch job at startup.
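As a sketch of what such a schedule might look like, the helper below builds the parameters for a recurring scheduled action. The parameter names follow boto3's `put_scheduled_update_group_action`; boto3 itself, the group name and the action names are assumptions, not part of the setup described above. As far as I know, a group keeps replacing instances while its desired capacity is above zero, so a second action that sets the size back to 0 later in the day is probably worth adding.

```python
from typing import Any, Dict


def scheduled_size_action(group_name: str, action_name: str,
                          hour_utc: int, size: int) -> Dict[str, Any]:
    """Build parameters for a daily scheduled resize of an Auto Scaling group.

    Recurrence uses cron syntax, evaluated in UTC. Setting min, max and
    desired capacity to the same value pins the group at exactly `size`.
    """
    if not 0 <= hour_utc <= 23:
        raise ValueError("hour_utc must be between 0 and 23")
    if size < 0:
        raise ValueError("size must not be negative")
    return {
        "AutoScalingGroupName": group_name,   # hypothetical group name
        "ScheduledActionName": action_name,   # hypothetical action name
        "Recurrence": f"0 {hour_utc} * * *",  # every day at hour_utc:00 UTC
        "MinSize": size,
        "MaxSize": size,
        "DesiredCapacity": size,
    }


# One instance at 06:00 UTC, back to zero at 09:00 UTC.
start_params = scheduled_size_action("daily-batch", "start-daily", 6, 1)
stop_params = scheduled_size_action("daily-batch", "stop-daily", 9, 0)
```

With boto3 installed and credentials configured, each dict could be passed along as `boto3.client("autoscaling").put_scheduled_update_group_action(**start_params)`.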

You probably don't need to schedule the stop time. Your batch job can simply shut down the instance when it completes the task, turning off charges.
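A minimal sketch of such a self-stopping job, assuming a Linux instance where the job's user can run `sudo shutdown` without a password. The function names are made up for illustration, and whether an OS-level shutdown stops or terminates the instance (and so what you keep paying for storage) depends on how the instance is configured.

```python
import subprocess
from typing import Callable


def shut_down_machine() -> None:
    """Ask the OS to power off (assumes Linux and passwordless sudo)."""
    subprocess.run(["sudo", "shutdown", "-h", "now"], check=True)


def run_then_shut_down(job: Callable[[], None],
                       shut_down: Callable[[], None] = shut_down_machine) -> None:
    """Run the batch job, then shut the machine down even if the job failed."""
    try:
        job()
    finally:
        # Runs on success and on error, so a crashed job does not
        # leave the instance running (and billing) all day.
        shut_down()
```

The `finally` is the important part: if the shutdown only happens after a successful run, a single failure keeps the instance up until someone notices.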

Update: I've written an article describing the steps to accomplish this approach:

Running EC2 Instances on a Recurring Schedule with Auto Scaling
http://alestic.com/2011/11/ec2-schedule-instance

Thanks for the idea :-)

From my perspective Jelastic is what you definitely need as it meets the stated requirements.

First of all, you can stop your environment whenever you need to, and while it is stopped it will not use any dynamic resources (RAM, CPU). Consequently, you pay only for the resources actually used.

Alternatively there is a way to manage this request directly with the Hosting provider you choose.

Jelastic is planning to open an API that will allow users to manage the lifecycle of their environments (stop, start), either from their own code or from the browser. It should be available in the near future, so do not miss the chance to try it out.

Actually, every provider that has hourly billing periods allows for that, as long as you automate instance creation and deletion through an API. You can find those in the IaaS comparison engine if you specify 2 hours a day in the "time on" field in advanced mode.
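To put rough numbers on it, here is the arithmetic under hourly billing, assuming every started hour is billed in full. The $0.10 hourly rate and the 30-day month are made-up figures for illustration, not any provider's actual price.

```python
import math


def billed_hours(runtime_minutes: float) -> int:
    """Hours charged for one run when each started hour is billed in full."""
    return max(1, math.ceil(runtime_minutes / 60))


def monthly_cost(runtime_minutes_per_day: float, hourly_rate: float,
                 days: int = 30) -> float:
    """Cost of one run per day over a month of `days` days."""
    return billed_hours(runtime_minutes_per_day) * hourly_rate * days


on_demand = monthly_cost(120, 0.10)      # 2 billed hours a day
always_on = monthly_cost(24 * 60, 0.10)  # instance never stopped
```

Note that a run of 2 hours and 1 minute is billed as 3 hours under this assumption, so it pays to keep the job comfortably inside an hour boundary.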

One more pointer is Selectel, a Russian provider that charges according to the actual load on a resource. So you pay half the CPU price at 50% utilization. I don't know whether you pay nothing at all when it drops to 0. The same applies to RAM.
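As a sketch of that load-proportional model: the charge scales linearly with utilization. The function name and prices are illustrative, and the sketch assumes there is no minimum charge at zero load, which is not confirmed.

```python
def load_based_cost(full_price: float, utilization: float) -> float:
    """Charge for a resource billed in proportion to its load.

    utilization is a fraction between 0.0 and 1.0. Assumes a purely
    linear model with no minimum fee (unconfirmed for any real provider).
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    return full_price * utilization


half_load = load_based_cost(10.0, 0.5)  # 50% utilization, half the price
```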