MV3D Development Blog

January 31, 2010

Storage Space

Filed under: Uncategorized — SirGolan @ 3:04 pm

I’ve been trying to come up with a better way to persist MV3D state info. Basically, I need a solution that meets the following:

Synchronous when saving data

Transactional

Extremely Fast when saving data

Maintains data consistency

Supports queries

Externally available (i.e. you can view/modify/query the data outside of MV3D)

Low CPU overhead

The current solution does support most of those. The four it’s probably worst about are queries, data consistency, CPU overhead, and external availability. It’s also just kind of wacky. You define a list of properties in your class that should be stored, along with which of those properties you’d like to generate a search index for. When it’s time to save an object, it generates a UUID for it, stuffs all the properties you defined into a special object, pickles it, and then uses Axiom to persist it to SQLite. It uses index objects stored in Axiom which match the UUID with the value of the property in the object. Querying is also a little odd since you use magical properties on a query object (q.a == 12). Unfortunately, the object type you are querying for may not have an a attribute, and the query object doesn’t check. Another interesting part of the system is that it requires all servers to store their data locally. I’m not sure how I feel about this since some MV3D data really isn’t useful to other servers, at least as the design stands now. However, it does require that there be two servers for every item stored in order to recover from a catastrophic failure of a single server.

I’ve tried in the past to go with a completely Axiom-based solution, but that didn’t work well because of the restrictions Axiom puts on any classes you mark up as Items that can be persisted. One thing I’d been thinking about was a dual object approach where you have the in-memory object, and when it needs to persist, it stores all its persistable attributes into a specific Axiom class. This system could even use powerUps. This had a couple of downsides, one being that you’d have to define twice as many classes, and you’d end up defining your data structure twice as well. The other option is to have a utility to generate that extra code, but I’m not a big fan of code generation, so I’d like to avoid that.

What I’m currently thinking of is adding a new service type to MV3D that would generally act like a data access layer (DAL). You’d mark up classes in Python with which attributes should be stored– and I’m thinking of combining this with the attributes that are sent over the network. In order to persist an object that had this mark up, you’d hand it over to this service. It’d then store the marked up attributes and return a unique ID you can use to retrieve the object later. While technically at this point, it really doesn’t matter where or how the data is stored, I’m going to go into that a bit. There would be several low level functions: register schema, upgrade schema, add object, update object, get object, and query. Basically, schema in this sense is the markup of a particular class which defines attributes to store and what type they are. For SQL based stores, this will be directly associated with a table. Schemas will be versioned so that when a newer version is registered, all data is upgraded. I see this happening by renaming the existing table, creating a new one with the new schema, and running code on each element to upgrade it.
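To make the low level operations a bit more concrete, here is a minimal sketch of what such a service could look like on top of SQLite. Everything in it is an assumption for illustration: the SketchStore class, its method names, and the dict-based schema format are made up, and none of this is MV3D code. It shows register/upgrade schema (rename the old table, create the new one, run upgrade code on each row), add, update, and get.

```python
import sqlite3


class SketchStore:
    """Hypothetical DAL service; table and column names are trusted input."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.schemas = {}  # table name -> (version, [column names])

    def register_schema(self, name, version, columns, upgrade=None):
        """Create the table for a schema, or upgrade it to a newer version."""
        old = self.schemas.get(name)
        if old is not None and old[0] >= version:
            return  # this version (or a newer one) is already registered
        cols_sql = ", ".join(f"{col} {kind}" for col, kind in columns.items())
        with self.db:  # commit on success, roll back pending changes on error
            if old is not None:
                self.db.execute(f"ALTER TABLE {name} RENAME TO {name}_old")
            self.db.execute(
                f"CREATE TABLE {name} (id INTEGER PRIMARY KEY, {cols_sql})")
            if old is not None:
                old_cols = ["id"] + old[1]
                rows = self.db.execute(
                    f"SELECT {', '.join(old_cols)} FROM {name}_old").fetchall()
                for row in rows:
                    # Run the caller's upgrade code on each stored element.
                    self._insert(name, upgrade(dict(zip(old_cols, row))))
                self.db.execute(f"DROP TABLE {name}_old")
        self.schemas[name] = (version, list(columns))

    def _insert(self, name, row):
        cols = list(row)
        marks = ", ".join("?" for _ in cols)
        cur = self.db.execute(
            f"INSERT INTO {name} ({', '.join(cols)}) VALUES ({marks})",
            [row[col] for col in cols])
        return cur.lastrowid

    def add_object(self, name, attrs):
        """Insert a row and return its primary key as the object's ID."""
        with self.db:
            return self._insert(name, attrs)

    def update_object(self, name, obj_id, attrs):
        sets = ", ".join(f"{col} = ?" for col in attrs)
        with self.db:
            self.db.execute(f"UPDATE {name} SET {sets} WHERE id = ?",
                            [*attrs.values(), obj_id])

    def get_object(self, name, obj_id):
        cols = ["id"] + self.schemas[name][1]
        row = self.db.execute(
            f"SELECT {', '.join(cols)} FROM {name} WHERE id = ?",
            (obj_id,)).fetchone()
        return dict(zip(cols, row)) if row else None


store = SketchStore()
store.register_schema("person", 1, {"name": "TEXT"})
mike_id = store.add_object("person", {"name": "mike"})
# Registering version 2 upgrades every existing row with the given function.
store.register_schema("person", 2, {"name": "TEXT", "level": "INTEGER"},
                      upgrade=lambda old: {**old, "level": 1})
```

The sketch keeps the schema registry in memory only; a real service would have to persist the version numbers alongside the tables.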

After the schema is created, adding an object would just be a SQL insert and updating would be similar. Both of those could return the primary key of the row as an ID. Combining the primary key with the object’s class and the service’s location would give you a method of retrieving that object from anywhere. Queries could make use of the class markup objects to let users write something like: store.query(Person, Person.name == "mike").
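One way the class markup could double as a query language is to make each marked up attribute a descriptor that returns itself on class access and overloads ==. The sketch below is only an illustration of that idea: Attr, Condition, and build_query are hypothetical names, and the generated SQL assumes one table per class named after it.

```python
class Condition:
    """A single comparison that can be rendered as a SQL WHERE clause."""

    def __init__(self, column, op, value):
        self.column = column
        self.op = op
        self.value = value

    def to_sql(self):
        return f"{self.column} {self.op} ?", [self.value]


class Attr:
    """Hypothetical markup for one stored attribute of a class."""

    def __init__(self, sql_type):
        self.sql_type = sql_type
        self.name = None

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, owner=None):
        if obj is None:
            return self  # class access: hand back the markup for queries
        return obj.__dict__[self.name]

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value

    def __eq__(self, other):
        # Person.name == "mike" builds a condition instead of a bool.
        return Condition(self.name, "=", other)

    __hash__ = object.__hash__  # defining __eq__ would otherwise drop hashing


class Person:
    name = Attr("TEXT")
    level = Attr("INTEGER")

    def __init__(self, name, level):
        self.name = name
        self.level = level


def build_query(cls, condition):
    """Turn a class and a condition into parameterized SQL."""
    where, params = condition.to_sql()
    return f"SELECT * FROM {cls.__name__.lower()} WHERE {where}", params


sql, params = build_query(Person, Person.name == "mike")
```

Because the condition is built from the class itself, a typo such as Person.nmae fails immediately with an AttributeError, unlike a free-form query object that accepts any attribute name.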

Looking at the requirements I mentioned, the major ones that it fails are being synchronous and maintaining data consistency. I put synchronous when saving up there because I like to be able to wrap methods that modify data in an @autoStore decorator which stores the object after the method runs. If this were async, either that function would have to return a deferred, or you’d lose data consistency if there was a failure. It would also be sweet to be able to define the markup classes as descriptors which would a) store the object and b) notify any network clients of the changes. A possible solution to this would be to strategically limit the remotely available functionality of the service to the low level operations I mentioned previously, and to make those operations have nothing to do with converting objects into persistable data. This would require a local object/service to interact with the remote store. The local code could build a queue of transactions to send to the master store, and this queue could be persisted via SQLite. This way, the data would be synchronously stored to disk locally and then sent off to the remote store whenever.
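Here is a rough sketch of the local queue idea combined with a storing decorator. The names (LocalQueue, auto_store, Character, stored_attrs) are invented for the example, the payload is plain JSON, and the remote store is just a callable that returns True on success. The point is only that the decorated method returns after the change is committed to local SQLite, while delivery to the remote store happens later.

```python
import functools
import json
import sqlite3


class LocalQueue:
    """Hypothetical local queue: committed to SQLite now, shipped later."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS pending "
            "(seq INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT)")

    def enqueue(self, change):
        with self.db:  # committed before returning, so callers stay synchronous
            self.db.execute("INSERT INTO pending (payload) VALUES (?)",
                            (json.dumps(change),))

    def flush(self, send):
        """Send queued changes in order; stop at the first failure."""
        rows = self.db.execute(
            "SELECT seq, payload FROM pending ORDER BY seq").fetchall()
        for seq, payload in rows:
            if not send(json.loads(payload)):
                break  # remote store unavailable: keep the rest queued
            with self.db:
                self.db.execute("DELETE FROM pending WHERE seq = ?", (seq,))

    def __len__(self):
        return self.db.execute("SELECT COUNT(*) FROM pending").fetchone()[0]


def auto_store(method):
    """Queue the object's stored attributes after the wrapped method runs."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        result = method(self, *args, **kwargs)
        self.queue.enqueue({"id": self.obj_id, "attrs": self.stored_attrs()})
        return result
    return wrapper


class Character:
    def __init__(self, obj_id, queue):
        self.obj_id = obj_id
        self.queue = queue
        self.gold = 0

    def stored_attrs(self):
        return {"gold": self.gold}

    @auto_store
    def earn(self, amount):
        self.gold += amount


queue = LocalQueue()
hero = Character(7, queue)
hero.earn(5)
hero.earn(3)
sent = []
queue.flush(lambda change: sent.append(change) or True)  # pretend remote store
```

If the send callable returns False (say the remote server is down), the unsent changes simply stay in the pending table for the next flush.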

One other issue here is that theoretically multiple servers could access the same object in the store at the same time. Yes, that sounds like a feature, but the aforementioned queue causes some issues with doing this. The data in the master store may not be 100% up to date. What might be interesting is to create a locking mechanism whereby when loading an object from the store, you acquire a lock on it. The hard part will be making sure that the lock goes away whenever you stop using the object and that failure conditions (such as your server crashing while holding locks on multiple objects) are properly handled.
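One common way to deal with the crashed-server case is to make every lock a lease that expires unless its owner renews it. That is my assumption for this sketch, not something already decided for MV3D, and the LockTable class, its method names, and the 30 second default are all made up.

```python
import time


class LockTable:
    """Hypothetical per-object locks that expire after a lease period."""

    def __init__(self, lease_seconds=30.0, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self.clock = clock
        self.locks = {}  # object id -> (owner, expiry time)

    def acquire(self, obj_id, owner):
        """Take or renew the lock; fail if another owner holds a live lease."""
        now = self.clock()
        holder = self.locks.get(obj_id)
        if holder is not None and holder[0] != owner and holder[1] > now:
            return False
        self.locks[obj_id] = (owner, now + self.lease_seconds)
        return True

    def release(self, obj_id, owner):
        """Drop the lock, but only if this owner is the one holding it."""
        holder = self.locks.get(obj_id)
        if holder is not None and holder[0] == owner:
            del self.locks[obj_id]


# A fake clock makes the expiry behaviour easy to see.
now = [0.0]
table = LockTable(lease_seconds=10.0, clock=lambda: now[0])
first = table.acquire(42, "server-a")      # server-a loads object 42
blocked = table.acquire(42, "server-b")    # server-b has to wait
now[0] = 11.0                              # server-a crashed and never renewed
recovered = table.acquire(42, "server-b")  # the lease ran out, so this works
```

The trade-off is that a server holding an object has to keep renewing its leases, and a server that is merely slow (not dead) can lose a lock it still thinks it owns.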

Another issue is what to do if the store rejects a change that seemed like it would succeed locally. If we go with writes that return immediately and don’t wait for success, it would already be too late to tell the original caller that something went wrong. What are some of the reasons this would happen?

Remote storage server is down. Ok, just keep it in the queue until it’s back up.

Remote storage server is out of space. Keeping it in the queue is probably best.

Schema mismatch or invalid data. This seems like the worst failure mode.

What do we do with a schema mismatch or invalid data? This indicates a fairly serious problem, so maybe it deserves a serious resolution– revert the object and all other objects related to the transaction that failed both on the remote store and in memory. That still doesn’t seem like a great way to resolve the issue, but it would maintain data consistency.

All in all, this seems like it’d be pretty challenging to implement, and it’d mean that I’d be maintaining a DAL/ORM as opposed to using a pre-existing one. I’m generally against re-inventing the wheel like that, but all the Python ORMs I know about are designed for webapps and don’t translate well to MMOGs.

Really at this point, I’m looking for someone to talk me out of this and tell me why it’s a horrible idea. Otherwise, I might just be crazy enough to try it out. One thing that’s bugging me is that the current mechanism works. I haven’t had any lost data issues or anything; however, I’ve found that sometimes it’s just easier to blow away the whole store than to, say, revert a change I made.

At the very least, I like how the new system abstracts out the DAL to a service that can optionally be on a remote server. This makes it so the underlying persistence technology can be pretty much anything. Another major benefit would be the ability to freeze objects by storing them and stopping simulation of them. Then you could unfreeze them later on a different server. What’s this good for? Well, making it so your character leaves the world when you log out for one. This isn’t currently possible without a load of hacks.


January 24, 2010

Ship it!

Filed under: Uncategorized — SirGolan @ 3:09 pm

MV3D Version 0.40 will be officially released today. I’m building the Windows binaries and source release now. I’m happy with the release, but visually, it’s a bit of a step backwards. I don’t have any good screenshots to show, or a movie detailing the new features. There are new features though. They just aren’t visible, and if they are working properly, you should never need to worry about them.

The main theme of this release was completely refactoring the scalability and redundancy of MV3D at the core. This was a very big success. I now feel comfortable saying that MV3D is ready to scale. In a properly configured cluster, items will be both spread across the servers and have redundancy in the case of a failure. Items can be transferred from server to server based on load balancing needs.

Looking at MV3D as a whole, there are still a few elements that need major work. The most obvious right now are tools, the client, and possibly data storage (sadly). It seems to me that tools to generate content are less useful if the data storage mechanism changes significantly (since either I’ll have to write a converter or you’ll have to lose all your stuff). On top of that, making the client look good and perform well is hard to do and mostly pointless without any content.

With that said, the logical order is to work on data storage first, then tools, and finally fix up the client. I’m hoping to not spend too much time on data storage. I have some ideas, and generally want to make whatever I come up with retain the same API. For tools, RED needs to be extended and generally made useful. There also needs to be a way to more or less export and import whole areas (or even whole realms) at a time. Any project bigger than a single person messing around will want to have this ability so that they can version control their world.

While fixing a recent bug, I came to realize that a tool for managing MV3D clusters is desperately needed. My repro steps for that bug involved checking in to SVN, updating two servers, restarting them, and then connecting to their SSH console. It would have been nice to be able to run 2 or more servers on my desktop for testing. So, this is another tool that should be forthcoming.

While I’m talking about that, the reason I had to repro the bug on the servers as opposed to in unit tests is because while there are unit test facilities for testing a full server against a real-ish client, there isn’t much in the way of full integration tests between multiple servers. Some of the problems I was tracking down ended up being issues that came up only when you started two servers, created some stuff, shut them down, restarted them, and tried to create more stuff. Integration test helpers for scenarios like this would be pretty awesome.

Another headache I ran into while rebuilding the demo servers was that since the starter world initialization code is written to work on a single server that provides all services, it doesn’t work well when you have a separate directory or account server. This made it so that I was basically typing the code on the console to create the world. No one should be expected to have to do that– a tool is needed that can bootstrap a set of servers to get them to the point where other tools like RED can be used to create a world.

There are plenty more things that need tools, but so I don’t bore people more than usual, I’ll move on to the client. The client seems slow. In reality, it gets 150-160 FPS, but it still feels sluggish for some reason. This is probably physics related, but it makes things look bad. The other big issue on the client is that since the server sends position corrections to the client’s avatar instead of the other way around, moving around can be a little clunky (even with the smoothing that’s done now if there’s a discrepancy). The UI on the client is very clunky and ugly. I’d really like to get that looking better. There’s tons of stuff really that needs help.

Sounds like a lot of work, doesn’t it? It definitely is. If you’ve gotten this far, you must be interested in MV3D, so why not contribute? There are a ton of open tickets now that would be excellent for someone looking to get started.

That’s all for now! If you try out the new release, just be warned that it’s visually the same as the last release. I’m hoping that as tools come online, I’ll be able to add more content to the demo world, but that depends on what code changes I have to make on the server to support the tools.


December 13, 2009

No news is.. Well, no news.

Filed under: Uncategorized — SirGolan @ 10:02 pm

It’s been quite some time since I’ve posted anything here about MV3D. Things have definitely been going slowly recently in that regard, though they are picking up again now. I work on an MMO all day, so when I get home, I generally want to do something else for a while. However, now that I’m more settled in, I’ve been able to start putting in the time again after work and on weekends.

Mostly, I’ve been trying to put together some content editing tools since the in game editor isn’t really the place where you’ll want to put all your content together. It’s good for terrain sculpting and adding foliage and such, but I think I’d pull my hair out trying to use it to make a town. One of the major methods of building complex things in MV3D is the modular object system (demo’d in a few of the videos I think). I’m working on a tool so that complex modular object systems can be designed without having to write code. It’s coming along really well. At this point, I’m not sure it really fits in with MV3D, so I’ve been keeping it on the side.

My goal is to get content creation and some semblance of an art pipeline up to a point where I can create a passable looking game world. I’m not expecting anything too wonderful because I have no talent making 3D models (and am therefore limited to GPL or CC licensed ones that I can find), but I should be able to come up with something. Then after that, I’m going to start working on putting together a demo game.

At this point in the development of MV3D, I’ve often been unsure what the next most important thing for me to work on is. If I start putting a game together, I’ll be able to quickly see what areas need help and work on those. At least that’s the theory.

One thing I really want to fix up is the client performance. Last time I ran a load test, having 50-100 characters running around caused the client to become fairly unresponsive. Not surprising since I’m not doing a ton to mitigate that.

I’m sure some people are probably hoping I’ll talk about 38 Studios and Copernicus. About all I can say is “Yes, it’s awesome, and no, I can’t talk about it.” Ok, I can say more than that. We’re hiring! Check out the 38 Studios Jobs Page. It’s really a ton of fun working with such an incredible group of people.

Hopefully, with things picking up on MV3D again, I’ll be posting a bit more. I’ve also started an MV3D page on Facebook. Mostly, I’ll use that for random related things that don’t really warrant a whole blog post. I’ll probably also put pictures up there that I wouldn’t want to put on the main page.


May 14, 2009

Lack of Windows 64bit support

Filed under: Uncategorized — SirGolan @ 11:13 pm

Seriously. I have XP 64 on my main system at home (ok, stop laughing) because I have 4 GB of RAM and don’t want to make my Athlon X2 into a K6 by running Vista. I’ve recently decided that it would be a good idea to move my music creation stuff from my old desktop to the new one. The old one just really can’t keep up any more and starts getting tons of lag when doing real time effects. Anyway:

M-Audio Midisport 1×1: no drivers for XP 64 (most M-Audio hardware: no drivers for 64 bit OSes)

Every MIDI loopback driver except LoopBe and MOLCp3: no support for 64bit OSes

Behringer FCA-202: no 64 bit drivers period (most Behringer hardware: no drivers for 64 bit OSes)

Cakewalk 64bit version: runs ok but seems to not detect some VST plugins (32 bit ones maybe?)

Cakewalk 32bit version: BSOD.

The last one is the one that really irks me. Apparently, the reason it blue screens is because Microsoft never made ASIO sound drivers work properly in XP 64 or Vista 64. Seriously? ASIO, by the way, is pretty much a requirement if you want minimal sound lag.

Apologies for the lack of updates lately. I’ve been plugging away with refactoring load balancing and am getting close to done. I’m considering another release shortly after that stuff is finished. I’ll probably put in some things I was hoping to get in to the last release, along with some bugfixes and then call it done. Other things have been keeping me busy such as my new job, which is completely awesome of course!


March 22, 2009

LFG

Filed under: Uncategorized — SirGolan @ 3:20 pm

There hasn’t been a lot of visible progress on MV3D recently. That’s mostly due to changes in my employment, but also because I’m working on a monolithic project to revamp the load balancing. I think I’m half way through, but I’m not really sure. Anyway, for the past 3 years, I’ve been the only developer working on MV3D. Mainly, I wanted to make sure that the foundation was in place and it had a clear vision. After these load balancing changes, though, I feel like the foundation will be there. So, I’m looking for help.

The big areas are client side programming, tools / editors, and help taming Ogre’s art pipeline. However, once the load balancing is done, I’m going to start working on Siegium again, but this time it’ll be as a demo world for MV3D that has game logic.

If you are interested, comment/email/post on the forums/send carrier pigeons.


March 4, 2009

Balancing act.

Filed under: Uncategorized — SirGolan @ 4:58 pm

I didn’t really go into detail too much as to how I solved load balancing and redundancy previously. In order to better flesh it out, I’m going to do so now. Please refer to the following diagram:

This is what I’m calling a cluster. It is what will be managing high availability in MV3D. Although there’s one big box there, the cluster is meant to run on as many CPUs as you can throw at it. I’ve just shown a single node in the cluster for simplicity. At the core, it is a sharded pool. A sharded pool uses a plug-in mechanism to link a given id with the pool it exists in. The default mechanism is a range-based partitioning scheme. A basic example of this would be one that sends ids 1-1000 to pool A and 1001-2000 to pool B. That’s the general idea. Clusters use the default range mechanism to partition the ids into shards. These shards are themselves sharded pools. In this case, though, they use another mechanism that is optimized for a 1-1 mapping of id to pool. It is expected that the pools it maps to are redundant pools. That’s a whole new type. The redundant pool does what it says: all of its objects are duplicated across all the members of the pool.
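As a small illustration of the two-level lookup, here is a sketch with a range mechanism at the cluster level and a direct id-to-pool mapping inside each shard. The class names (ShardedPool, RangeMechanism, DirectMechanism) and the pool_for method are placeholders I picked for the example, not the actual MV3D classes, and redundant pools are stood in for by plain strings.

```python
import bisect


class RangeMechanism:
    """Maps contiguous id ranges to pools, e.g. 1-1000 -> A, 1001-2000 -> B."""

    def __init__(self, ranges):
        # ranges: iterable of (first_id, last_id, pool), inclusive on both ends
        self.ranges = sorted(ranges, key=lambda r: r[0])
        self.starts = [r[0] for r in self.ranges]

    def pool_for(self, obj_id):
        i = bisect.bisect_right(self.starts, obj_id) - 1
        if i >= 0:
            first, last, pool = self.ranges[i]
            if first <= obj_id <= last:
                return pool
        raise KeyError(obj_id)


class DirectMechanism:
    """Maps each id to exactly one pool (the 1-1 case)."""

    def __init__(self, mapping):
        self.mapping = dict(mapping)

    def pool_for(self, obj_id):
        return self.mapping[obj_id]


class ShardedPool:
    """A pool that delegates id lookup to a pluggable mechanism."""

    def __init__(self, mechanism):
        self.mechanism = mechanism

    def locate(self, obj_id):
        return self.mechanism.pool_for(obj_id)


# Shards are themselves sharded pools; strings stand in for redundant pools.
shard_a = ShardedPool(DirectMechanism({1: "redundant-1", 2: "redundant-2"}))
shard_b = ShardedPool(DirectMechanism({1001: "redundant-3"}))
cluster = ShardedPool(RangeMechanism([(1, 1000, shard_a),
                                      (1001, 2000, shard_b)]))

pool = cluster.locate(2).locate(2)  # cluster -> shard -> redundant pool
```

Swapping the mechanism is the only thing that distinguishes the cluster level from the shard level here, which is the appeal of the plug-in approach.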

So far, there’s a hole in the scheme. Where do these mystical ids come from? Anyone who has looked at MV3D knows that an id is either an integer (0) or a two integer tuple (0, 0) where the first number is the parent’s id and the second is the child’s id. For instance, in the case of an item, the first number is the realm id and the second is the item id within the realm. Anyway, these numbers come from an IDDispenser object. The cluster has one of those. It sits in a cluster-wide redundant pool. This means there is one master dispenser per cluster (and multiple redundant copies). This is where the ids come from. The redundant pool it sits in can also be used for other things that belong to the cluster. One particular use for this is to have an object which provides data for the pool. For example, an Asset Group would be a cluster, but there would be an AssetGroup object in the Redundant pool that has information about the group. To expand on that notion a bit further, the redundant pool for a Realm would contain the Realm object which contains physics properties.
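For the id format itself, a dispenser can be sketched in a few lines. This is only an illustration of the integer versus (parent, child) tuple scheme described above; the class name SketchIDDispenser and its next_id method are invented here and say nothing about how the real IDDispenser is written or replicated.

```python
class SketchIDDispenser:
    """Hands out sequential ids, optionally scoped under a parent id."""

    def __init__(self, parent_id=None, start=0):
        self.parent_id = parent_id
        self._next = start

    def next_id(self):
        value = self._next
        self._next += 1
        if self.parent_id is None:
            return value                  # top level: a plain integer
        return (self.parent_id, value)    # child: (parent id, child id)


realm_ids = SketchIDDispenser()
realm_id = realm_ids.next_id()            # id for a new realm
item_ids = SketchIDDispenser(parent_id=realm_id)
first_item = item_ids.next_id()           # item ids carry the realm id
second_item = item_ids.next_id()
```

In the real system the counter would have to live in the cluster-wide redundant pool so that a failover never hands out the same id twice; the sketch ignores that part entirely.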

I haven’t mentioned too much about the load balancing aspect of this though. Really, it’s fairly easy. The sharding of the top level cluster ensures that you can split up the management of the cluster across as many servers as needed. Then individual items in the cluster can be moved from pool to pool very easily to balance out the load. With directory, realm, and asset servers, the load balancing will be fairly manual. When you need more servers, add them to the appropriate cluster or pool. Load in these areas should be fairly predictable by standard means. The load balancing gets a lot more interesting when you start talking about simulated parts of the world.

Load balancing at that level is more tricky because the load isn’t predictable and is a lot more dynamic. 100 or more players could crowd into a shop where just moments before there were none. I’ll likely write about this in more detail with a future post, but the basic idea is to automate enough of the load balancing to make capacity planning predictable. MV3D will have a bank of servers partitioned to simulate a set of areas and a mechanism to balance that load across the servers. With that in place, capacity planning can be done by determining the load across the group of servers by changing which servers simulate a given area. Watching the increase over time should allow the operator to predict when more hardware is needed due to population growth. In general, all of your simulation servers would be in this group so that you can best make use of them. As I said, though, more on this later.


February 8, 2009

Redundancy, check.

Filed under: Uncategorized — SirGolan @ 2:42 pm

After about a week of banging my head against the wall trying to figure out a better way to do redundancy in MV3D, I think I’ve finally got it worked out. I even have code to back it up. What I’ve created is a redundant pool of objects. Servers can join or leave the pool, and objects are replicated to all servers in the pool. One server is the master and the rest are slaves. If the master goes down, one of the slaves picks up the slack. The master can re-join the pool as a slave. It seems to work great.
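Stripped of all networking, the behaviour looks roughly like the toy model below. It is a sketch under big assumptions: everything runs in one process, the class and method names are invented, and "promotion" is simply "the longest-standing remaining member becomes master", which may not match how the real code chooses.

```python
class RedundantPool:
    """Toy model: every member holds a full copy; members[0] is the master."""

    def __init__(self):
        self.members = []  # in join order
        self.copies = {}   # server name -> {object id: state}

    @property
    def master(self):
        return self.members[0] if self.members else None

    def join(self, server):
        """Add a server as a slave and give it a copy of every object."""
        source = self.copies[self.master] if self.members else {}
        self.copies[server] = dict(source)
        self.members.append(server)

    def leave(self, server):
        """Remove a server; if it was the master, the next member takes over."""
        self.members.remove(server)
        del self.copies[server]

    def put(self, obj_id, state):
        """Replicate an object to all servers in the pool."""
        for server in self.members:
            self.copies[server][obj_id] = state


pool = RedundantPool()
pool.join("alpha")
pool.put(1, "sword")
pool.join("beta")    # beta receives a copy of object 1 on joining
pool.leave("alpha")  # master goes down; beta picks up the slack
pool.join("alpha")   # the old master re-joins as a slave
```

The hard parts that the sketch skips are exactly the ones that need the million tests: detecting that the master is gone, and making sure writes in flight during the switch are not lost.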

The next step is writing a million more tests to make sure it acts properly in every situation I can think of. Once that is done, I think I can build on the platform I created to make different types of HA pools. For instance, one that divides objects up into sub pools.

Anyway, I’m pretty happy with this new method of keeping things redundant. It’ll take a while to integrate it into the rest of the code, but I don’t foresee any major problems. It is pretty cool though: I just implemented a method to move items from one pool to another. This could be used when changing the partitioning around, or (if I do one pool per area) when an item moves from one area to another.


February 7, 2009

Unified theory of high availability.

Filed under: Uncategorized — SirGolan @ 2:57 pm

I’ve been working on ticket #211 for MV3D lately. If you look at the ticket, you’ll see it’s something that’s been kicked around for a bit. The current redundancy and load balancing code is fairly broken in trunk. It did work at some point in the past, but a myriad of changes have caused it to break in various ways. Clearly, one problem is that there weren’t enough unit tests for it or this would never have happened. The larger problem is that the code was very repetitive and not very organized. I’m setting out to fix this, but it’s not been easy. I’ve written a bunch of code and tests, but I have no idea if it is remotely the right direction to go in. Not to mention that this is the 3rd or 4th major direction change I’ve had since I started on this ticket. Nonetheless, I really want to get these details worked out before MV3D gets much further along. In my experience, the longer you wait on something like this, the harder it gets to fix.

If what I’m talking about makes no sense, read up on MV3D’s Server Architecture. The high level requirements are that there be no single point of failure for any service MV3D provides and that servers below the Directory servers can come and go with little disruption. Starting at the top, a Directory should be split horizontally (by item id) across a number of servers. Each part of the Directory should be replicated to several servers. Asset Groups and Realms should get the same treatment. Realms should organically divide items across Simulation Services based on the Area they are in. All Items should exist on at least two Simulation Services. To further complicate things, some types of Areas can spread themselves across multiple servers, which means the objects within them have to do the same. It is very important to keep objects in the same Area (or piece of an area) on the same server so that there is no time lag for physics collisions.

The current way of doing this makes each level in charge of distributing the load for sub-levels. This means that there is also no method to get HA Directory Services. The Directory stores a master and a list of slaves for each Realm and Asset Group. Asset Groups also have no redundancy right now and must exist on a single server. The Directory Service is able to promote and demote copies of Realms on various servers from master to slave. Realms manage a list of Areas and what servers they can be simulated on. Each Area manages where its objects are simulated. This generally makes for a big mess without much rhyme or reason as to where anything is served from. Another problem I’ve encountered is that having all HA items be pb.Cacheable makes it hard to switch them from slave to master or back without re-caching them (which would cause confusion to any object that kept a reference to the old version).

I was going to write about my solution, but after I started, it became clear that it was completely wrong, so back to the drawing board.


January 27, 2009

Better late than never…

Filed under: Uncategorized — SirGolan @ 3:43 pm

I’m finally gearing up to make another official release of MV3D. This release includes quite a lot of new content creation features including:

Add and remove objects

Move and rotate objects

Select objects by mouse or from a list

Terrain Editor

Grass Editor

Object Factory UI (for creating complex objects using templates)

A bunch of big client features:

Configuration GUI

Camera that doesn’t go through walls

New loading screen / progress bar

Windows install exe!

Plus some behind the scenes additions:

No more media directory (shaves 30 MB or so off the download size)

Bumpers on the ends of the world so characters don’t fall off

Client requests items from the server instead of the server pushing them on the client

Lots of new unit tests! I think double the number in the last release.

Here’s a video I made for the release:

You can download the first release candidate of the MV3D Client for Windows right now and connect to the alpha server. If you don’t already have an account, go to the MV3D login server and create one.

As a side note, the music in the video is the first piece I’ve recorded since I got my new bass guitar.


December 1, 2008

MV3D Status

Filed under: Uncategorized — SirGolan @ 11:03 pm

I’ve actually had some time to work on MV3D lately, which I’ve been happy about. I did make a short video of the in game editor. It’s nothing wonderful (though I really like the music I came up with for it).

One of the things I did over the long weekend was to start making the client behave a little more like a client and less like a dumb terminal for the server. Previously, when you connected to a server, it would just shove all of the objects in your view range at you. Now, the client gets to pick. This has the added bonus that the client loads the images from closest to farthest. It can also limit the view range to less than the maximum to save bandwidth or cpu. The client is still limited to getting items which are in its view range by the server of course.
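The client-side selection boils down to something like the sketch below. The function name, the dict of item positions, and the two range parameters are assumptions made for the example; the real client works with MV3D items rather than bare coordinates.

```python
import math


def items_to_request(avatar_pos, item_positions, client_range, server_max_range):
    """Return item ids within range, ordered from closest to farthest.

    The client may ask for less than the server's maximum view range
    (to save bandwidth or CPU) but never for more.
    """
    limit = min(client_range, server_max_range)
    measured = [(math.dist(avatar_pos, pos), item_id)
                for item_id, pos in item_positions.items()]
    return [item_id for dist, item_id in sorted(measured) if dist <= limit]


nearby = items_to_request(
    avatar_pos=(0.0, 0.0, 0.0),
    item_positions={"tree": (10.0, 0.0, 0.0),
                    "rock": (3.0, 4.0, 0.0),
                    "tower": (100.0, 0.0, 0.0)},
    client_range=50.0,
    server_max_range=200.0,
)
```

The server still has to check each request against its own view range, since a client could ask for anything.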

The other big thing I did (and I’m still amazed I got this working) was to completely remove the need for the huge Media directory for the client. It’s about 65MB. I had to keep a few things: just the GUI-related files and the background image. Everything else is downloaded through the resource system in game like it is supposed to be. This should reduce the initial download size for the client by a lot.

That brings me to the next thing I’m working on, which is getting a Windows installer for the Client to work. I’m trying Nullsoft’s installer system (NSIS) to see how that goes. It seems pretty simple. I’d like to make some sort of script to put together the release using py2exe and such. Currently, there’s a bunch of manual work involved after the py2exe step.

Currently, there are 18 tickets left in the release. Hopefully I can get those finished up this month and get a new release out. That’ll depend on if I can keep feature creep out of there. I really want to make a simple “Add item from template” type deal in the editor since right now adding an item is rather useless.

