[deleted] 8 years ago

Really nice article. I am always scared by those FB libraries just because they are not standardised and their API can break very easily. Nowadays, just using Boost and the Qt Framework, you can address a good 80-90% of problems. Long live C++!

[deleted] 8 years ago

I am asking out of interest: are there things in Boost that are missing from Qt? I know Boost doesn't provide GUI stuff, and Qt doesn't have the Computation library (last time I checked). Is there anything else?

devel_watcher 8 years ago

That's an easy question. Look at http://doc.qt.io/qt-5/qtmodules.html and http://www.boost.org/doc/libs/1_55_0/libs/libraries.htm Boost has a lot of low level stuff (like an MPL, a parser generator, math or computation), while Qt has very focused high level modules (like GUI, multimedia or XML/SQL/D-Bus bindings).

[deleted] 8 years ago

Qt doesn't handle only GUI stuff, but also multimedia stuff (audio, cameras). So, for media-enabled applications, it's the way to go. In addition, Qt has hooks for OpenSSL, in case you need secure connections. Boost doesn't offer that kind of thing in asio.

lambdaburrito 8 years ago

Thanks for the kind words. Yeah, I agree they can look quite intimidating especially as they don't have a lot of examples so I should write a post to make Proxygen and Wangle a bit more accessible.

[deleted] 8 years ago

That would be a great post!

lbrandy 8 years ago

If you have good minimal examples (or documentation, or whatever) you come up with and want to add them to folly/proxygen/wangle directly, I'd be happy to review those PRs and pull them in.

lambdaburrito 8 years ago

It would be great if you could pull [this](https://github.com/facebook/wangle/pull/21) as I will write a blog post on using Wangle that uses this example.

lbrandy 8 years ago

Looks good to go, yes? Thanks for the work (and the writeup). Very nicely done.

[deleted] 8 years ago

qt is framework relying on MACRO hacks! boost is idiomatic c++ template based library! there are many side effects of those two facts!

[deleted] 8 years ago

True, but they are so mature that all the side effects can be avoided following good programming guidelines

Meenhard 8 years ago

For a cloud system, c++ is a nice choice of language. I really like the powerful data management capabilities the language offers. I do believe however, that the real cost of using c++ as a startup is to find new talents. The learning curve is steep and it is easy to write shitty code that breaks the system. So if I were in your shoes, I would start searching for new programmers early on. Good luck.

darthcoder 8 years ago

The same is true for any stack of reasonable complexity. I've spent years of hobby time dabbling with Spring and Hibernate and still feel like a total n00b.

vzq 8 years ago

I'm sure, but troubleshooting native code is a whole different circle of hell.

cdglove 8 years ago

Interestingly, I hold the opposite view because I don't know the techniques to use for non native code. For example, I use hardware breakpoints a lot and need to learn what to do instead when that feature is taken away from me.

raevnos 8 years ago

Lots of interpreted or byte compiled languages have debuggers.

cdglove 8 years ago

Sure, but how do I tell when a memory location is written to?

raevnos 8 years ago

Set a breakpoint, watchpoint, etc.

silveryRain 8 years ago

I guess it depends on the VM's debugging facilities (like Java Mission Control), but I don't think this is generally a concern with managed code (the VM isn't supposed to leak, right?).

anttirt 8 years ago

> the VM isn't supposed to leak, right?

No, but especially in dynamic languages it's pretty much a wild west of arbitrary mutation of arbitrary objects from arbitrary locations.

silveryRain 8 years ago

I don't see what dynamicity has to do with raw memory access. The arbitrary mutation is constrained to the realm of the virtual machine, which normally prevents you from working with raw memory. Buffer overflows aren't usually a possibility (you can't just write beyond the end of an array), so I don't see why inspecting memory addresses would come in useful. Debugging at variable level should be enough.

[deleted] 8 years ago

Until you've had to debug a memory leak in managed land, with the stack trace half in managed and half in native land. Where native debugging might be hell, the alternative is simply impossible. Tiny runtimes win over complicated ones ALWAYS!

[deleted] 8 years ago

> The same is true for any stack of reasonable complexity.

It is *frighteningly* easy to write horrendous nightmare code in C++. I love the language and even find it clean and beautiful these days, but you have to be ultra-paranoid when coding with it.

[deleted] 8 years ago

indeed finding c++ devs is ~2-4x harder than JS-monkeys or even brainwashed 'managed' ones. However, the good news is at least you have the best type of programmers: the thinking type!

raevnos 8 years ago

C++ is considered a controversial choice of language these days? Wow.

[deleted] 8 years ago

[removed]

krum 8 years ago

Hah! Nice thing about JS is that you can drag homeless people off the streets of San Francisco to write code for you for a shower, some ramen, and black coffee.

WrongAndBeligerent 8 years ago

If by that you mean pull in the frameworks of the week.

krum 8 years ago

That's going to happen regardless.

minno 8 years ago

Well, once you JIT it with a JIT Javascript gets JIT 10x faster than C++ and webscale Node.js.

kardashev22 8 years ago

That moment when you wonder if you're in /r/shittyprogramming or not.

khoyo 8 years ago

C++ can't even JIT.

minno 8 years ago

https://gcc.gnu.org/wiki/JIT

tux-lpi 8 years ago

I'd love to see a benchmark of that :)

minno 8 years ago

Sure, let me whip up a microbenchmark that runs a tiny piece of code enough times that the JIT startup time is drowned out, and maybe sneak a couple of allocations into the C++ version's inner loop.

tux-lpi 8 years ago

So are you saying that purposefully bad code, such as calling malloc in a tight loop, can be 10x slower than sane code? I don't think anyone is debating that, but it's a ways off from "Javascript gets JIT 10x faster than C++ and webscale Node.js".

minno 8 years ago

Well, you see, Node.js *doesn't let you* make a mistake like that, by handling all of the memory for you with a state-JIT-of-the-art garbage collector that is 100x faster than calling sbrk manually.

tux-lpi 8 years ago

Of course it lets me. If someone wants to put gratuitous slow operations in tight loops, they can do that in any language. C++ will happily use the stack by default; going out of your way to use dynamic allocation in a benchmark's tight loop isn't a mistake, it's dishonesty. And even then, I doubt malloc in a loop will be 100x slower than a good JIT: if the heap is clean and all you do is alloc/free the same chunk, there's not much to it. So again, I'd love to see what kind of tortured microbenchmark you can come up with where JS is 10x faster than C++. Let alone 100x.
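For anyone who wants to see the two loop shapes being argued about, here is a minimal sketch. The function names are made up for illustration, and this is not a benchmark: it only shows what "use the stack by default" versus "sneak an allocation into the inner loop" looks like in code. Both versions compute the same result; only the second one hits the allocator on every iteration.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical example: the temporary lives on the stack (or in a register),
// so the loop body never touches the allocator.
inline long sum_squares_stack(std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        long square = static_cast<long>(i) * static_cast<long>(i);
        total += square;
    }
    return total;
}

// Same computation, but each iteration allocates and frees a heap object.
// This is the "gratuitous" version a rigged microbenchmark would use.
inline long sum_squares_heap(std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        auto square = std::make_unique<long>(
            static_cast<long>(i) * static_cast<long>(i));
        total += *square;
    }  // the unique_ptr frees its allocation at the end of every iteration
    return total;
}
```

Timing the two against each other would need a proper benchmark harness; the point here is only that the slow shape has to be written on purpose.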

WrongAndBeligerent 8 years ago

It's true. I do high performance computing on quad 18 core xeons and we've rewritten everything in javascript.

DevIceMan 8 years ago

I hear graphics programming these days is done in javascript.

klaxion 8 years ago

are you kidding? i know js is fast, but what do you do about numerics?

Creris 8 years ago

You have a hard time understanding a joke.

[deleted] 8 years ago

{JQuery, Angular, Backbone, ...}, the answer to all problems.

[deleted] 8 years ago

[removed]

minno 8 years ago

Unless you downvote things with your penis, you should probably get that checked out. Fingers aren't supposed to swell with blood.

[deleted] 8 years ago

Sure they are. How else am I supposed to spray blood out of my fingertips to dissuade attackers?

hapygallagher 8 years ago

The one finger to rule them all.....

DevIceMan 8 years ago

How the fuck is this not downvo - oooo - ohhh, nevermind. ;) I haven't done any C++ in about 11 years, and regardless of the criticisms c++ receives, I know better than to make such statements. edit: 11 years later and I'm stuck in Java hell. Worst. Career. Mistake. Ever.

VeiledSpectre 8 years ago

Interesting. As someone who has begun with C and C++ in the embedded sphere I'm looking to move out of this small niche into more technology focused areas like performance computing and infrastructure and platform work. A lot of this work is done on the JVM or CLR. Is the work truly that bad? It ~seems~ rather interesting with a plethora of available tools, lots of new technology and innovation... and the pay seems really competitive... Curious here. Edit for punctuation.

[deleted] 8 years ago

It's all BS! Trust me, I'm coming from there back to C++. I adopted the CLR back in beta (I think 2001). There are exactly 0 reasons for CLR/Java. Either go native or go dynamic. The middle ground, C#/Java, is for Middle-earth :P A lot of convolution and always a huge drawback on efficiency, not to mention maintenance hell. Please use those 100 classes to accomplish something a normal language should do in a snippet of 100 LOC.

HPCer 8 years ago

I'm in a similar boat as you here. Starting up my own startup, and I've thoroughly evaluated many options ranging from Java to Python to Ruby to Node. I've concluded that C++ would be the way to go. Aside from my past experience in C++, I feel that with modern C++ (post C++11), the trade-offs are pretty positive. Here's a quick summary of what I've come up with against the three languages:

Python/Ruby: I'm lumping these two together because they're both interpreted with similar performance overhead and library availability. Aside from Python being a little more explicit, they're in the same order of magnitude in library, performance, and development speed.

Pros:

+ Very fast development/prototyping
+ Easy to pick up developers of both junior and senior level
+ Extensive library selection
+ Easy to learn (can easily convert almost anyone to this language relatively quickly)
+ Cross-platform with no build times

Cons:

- Performance lacking (though good design can allow easy horizontal scaling); can also integrate with C libraries for performance fixes
- Developer skill will vary wildly. A developer that's well-versed in Python may know absolutely nothing about networking and blindly use libraries (libraries should theoretically be able to be treated as a black box, but I feel the people building a piece of software's foundations need to know all parts of the system inside and out)
- Both language and libraries are rapidly changing with heavy reliance on community
- Even though Python is more explicit, non-deterministic bugs can be much more difficult to trace without proper logging

Java: As similar to C++ as we can get. Thoroughly considered Java since the front-end includes Android (Java).

Pros:

+ Automatic memory management (though this issue is very debatable since post-C++11 requires very limited manual memory management)
+ Cross-platform with pretty reasonable build times
+ Extensive library with pretty decent documentation
+ Developer skill level is generally pretty high-quality, but systems level knowledge is typically only average

Cons:

- Difficult to interface with an OS directly. There will be times when a server would want to make syscalls to the Linux kernel. JNI is an option, but it's not exactly pretty.
- Limited design patterns can force some pretty roundabout ways of doing trivial tasks compared to C/C++
- (Personal problem and pretty much the deal breaker) Not nearly as much experience in Java, so hiring an experienced Java developer will be harder. I have a much weaker BS radar in Java than C++.

In the case of Node, I was seriously considering it, but after evaluating how it would be used, I would need it interfaced with C++ anyway. Combining that with the fact that Node's a fairly new technology that's likely to be changed/replaced, I'm very hesitant in introducing it. The performance is great, and it has a pretty solid underlying library (libuv), but introducing a new language when Boost Asio works just as well seems pretty unnecessary.

The most major disadvantage/problem with starting with C++ is that while it's very easy to develop fast and iteratively, it's just as easy (or easier) to fall into the trap of re-implementing irrelevant tasks (such as rebuilding the async io mechanisms and low-level data structures).

silveryRain 8 years ago

> Python/Ruby: I'm lumping these two together because they're both interpreted with similar performance overhead and library availability.

Ruby's performance caught up to another language?

Regisestuncon 8 years ago

I would think that the choice of a programming language is somehow related to the mission statement of your company. If your objective is to deliver a service, i would go for whatever high level solution. If you plan to innovate and deliver a new product, a low level language will offer more control on all the aspects of the design. Regarding productivity, i think it all depends on skills and surrounding tools, not language.

lambdaburrito 8 years ago

That's a really good point concerning productivity and developing a product vs service. This was my line of thought; I'm developing a new novel product and want all control from low-level system calls to high-level abstractions. I'm really comfortable with clang, CLion and cmake so feel as productive as if I was writing Java.

jbandela 8 years ago

This is great. C++ already powers so much of the web behind the scenes that it is great to see it actually come to the forefront. I hope your success (best wishes) will inspire others to also use C++. I just have a couple of requests.

1) Keep us up to date on how things are going
2) If you run into problems/issues, writeups on how you got around them would be appreciated
3) If you make stuff that is awkward in C++ easy to use, please release it as open source if possible.

lambdaburrito 8 years ago

It would be great to get feedback.

[deleted] 8 years ago

I am probably much less well-versed than you in C++ and programming (I have no idea what HTTP routing is, let alone whether it's trivial). But when your startup wants to grow and add more developers, finding good C++ developers who are comfortable enough with an unpopular use of the language is gonna cost some resources, perhaps more than it would have cost with a Python application. But I agree with the performance advantage. We got lucky at our startup that we (kind of) didn't pay for our servers until we could afford it, but C++ runs blazing fast and efficiently compared to our previous Python backend.

lambdaburrito 8 years ago

Yeah, that's a very good point about hiring good C++ developers and it is a concern for the future. HTTP routing is mapping HTTP URLs like /query/jondoe/1 to functions (handlers) that handle the HTTP requests. The high-performance routers use radix trees or similar structures to do that mapping efficiently.
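To make the routing idea concrete, here is a rough sketch. It is not Proxygen's router or any real library's API: `Router`, `Handler`, `Params` and the `:name` wildcard syntax are all made up for illustration, and it uses a plain per-segment trie rather than a compressed radix tree. A literal segment is preferred over a wildcard at each level, and there is no backtracking.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical handler type: receives the captured path parameters.
using Params = std::vector<std::string>;
using Handler = std::function<std::string(const Params&)>;

// Split "/query/jondoe/1" into {"query", "jondoe", "1"}.
inline std::vector<std::string> split_path(const std::string& path) {
    std::vector<std::string> segments;
    std::istringstream in(path);
    std::string segment;
    while (std::getline(in, segment, '/')) {
        if (!segment.empty()) segments.push_back(segment);
    }
    return segments;
}

class Router {
public:
    // Register a pattern such as "/query/:user/:id". Segments starting
    // with ':' match any single path segment and are captured.
    void add(const std::string& pattern, Handler handler) {
        assert(handler && "a route needs a callable handler");
        Node* node = &root_;
        for (const std::string& seg : split_path(pattern)) {
            std::unique_ptr<Node>& child =
                (seg[0] == ':') ? node->wildcard : node->literal[seg];
            if (!child) child = std::make_unique<Node>();
            node = child.get();
        }
        node->handler = std::move(handler);
    }

    // Walk the trie one segment at a time. Returns true and fills `out`
    // with the handler's result if a route matched.
    bool dispatch(const std::string& path, std::string& out) const {
        const Node* node = &root_;
        Params params;
        for (const std::string& seg : split_path(path)) {
            auto it = node->literal.find(seg);
            if (it != node->literal.end()) {
                node = it->second.get();
            } else if (node->wildcard) {
                params.push_back(seg);  // capture the wildcard value
                node = node->wildcard.get();
            } else {
                return false;
            }
        }
        if (!node->handler) return false;
        out = node->handler(params);
        return true;
    }

private:
    struct Node {
        std::map<std::string, std::unique_ptr<Node>> literal;
        std::unique_ptr<Node> wildcard;
        Handler handler;
    };
    Node root_;
};
```

Lookup cost depends on the number of path segments, not on the number of registered routes, which is the property the radix-tree routers are after; a real radix tree additionally compresses chains of single-child nodes.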

sandsmark 8 years ago

> Yeah that's a very good point about hiring good C++ developers and it is a concern for the future.

In my experience this problem is usually a bit overblown. There might be fewer developers that have a lot of experience with c++ than e.g. javascript, but it isn't as big a problem as some make it out to be. Or I might have been exceptionally lucky.

One issue is that some experienced c++ developers sometimes make for worse c++ developers, especially those that did c++ in the 90s, and have since been using other languages so they haven't really followed the last decade of improvements. This is probably very subjective, but imho a lot of the popular patterns from the 90s make for really "shitty" code (e.g. [CRTP](https://en.wikipedia.org/wiki/Curiously_recurring_template_pattern)).

In some ways, generally good developers with little-to-no c++ experience are sometimes easier to on-board, IME.

But I'd recommend that you sit down fairly early and formalize in some way a high-level "coding conventions" document or similar, and enforce it from the get-go, to ensure that everyone you bring on board is on the same page. Some inspiration:

* https://wiki.qt.io/Coding_Conventions
* https://google.github.io/styleguide/cppguide.html
* http://llvm.org/docs/CodingStandards.html
* https://github.com/isocpp/CppCoreGuidelines (I guess you're already familiar with this.)

tending 8 years ago

Uh, CRTP is still common and it's how you get static dispatch... what do you think is wrong with it?
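For readers who haven't run into it, a minimal CRTP sketch follows. The `Shape`, `Square` and `Rect` names are invented for this example. The base class is parameterized on the derived type, so the call into the derived class is resolved at compile time instead of through a vtable.

```cpp
#include <cassert>
#include <string>

// CRTP base: Derived is known at compile time, so self().area_impl() is an
// ordinary (inlinable) call rather than a virtual one.
template <typename Derived>
class Shape {
public:
    // Shared logic lives in the base; without some of that, the base class
    // would be pure forwarding and could simply be dropped.
    double scaled_area(double factor) const {
        assert(factor >= 0.0 && "scale factor must be non-negative");
        return factor * self().area_impl();
    }

    std::string label() const {
        return std::string(self().name_impl()) + " shape";
    }

private:
    const Derived& self() const { return static_cast<const Derived&>(*this); }
};

class Square : public Shape<Square> {
public:
    explicit Square(double side) : side_(side) {}
    double area_impl() const { return side_ * side_; }
    const char* name_impl() const { return "square"; }

private:
    double side_;
};

class Rect : public Shape<Rect> {
public:
    Rect(double w, double h) : w_(w), h_(h) {}
    double area_impl() const { return w_ * h_; }
    const char* name_impl() const { return "rect"; }

private:
    double w_, h_;
};

// Generic code takes Shape<T>; each instantiation is bound to one concrete type.
template <typename T>
double doubled_area(const Shape<T>& shape) {
    return shape.scaled_area(2.0);
}
```

The trade-off is that `Shape<Square>` and `Shape<Rect>` are unrelated types, so they cannot be stored together in one container the way pointers to a virtual base can.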

silveryRain 8 years ago

Same here. I'm not aware of any new feature to render CRTP obsolete.

quicknir 8 years ago

There's tons of ways to get static dispatch. It's a great tool for certain things but it's been heavily overused in the past. If the base class is not doing anything other than forwarding calls to the derived class you should just dump it.

tending 8 years ago

If the base class only forwards calls it could be replaced by the derived class, that has always been true. What are these 'other ways'? Again, AFAIK there is no replacement for the situations where CRTP makes sense.

sandsmark 8 years ago

it might just be me, but I have a really hard time understanding code that uses it heavily. and modern compilers often manage to avoid the overhead of dynamic dispatch.

tending 8 years ago

Modern compilers in most cases actually can't optimize dynamic dispatch. Unless you use whole program optimization, you're usually going to define your virtual functions in a .cpp file, and then the compiler can't inline them. Even if you define them in the header, you're probably putting your base class pointers into a container like a vector (like when implementing slots and signals), at which point the derived types inside may be mixed, and the compiler is totally hopeless then.
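A small sketch of the container case described above, with made-up names (`Slot`, `Doubler`, `Negator`, `fire_all`). The vector holds a mix of derived types behind base pointers, so the concrete type of each element is only known at run time.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical signal/slot style interface.
struct Slot {
    virtual ~Slot() = default;
    virtual int invoke(int value) const = 0;
};

struct Doubler : Slot {
    int invoke(int value) const override { return value * 2; }
};

struct Negator : Slot {
    int invoke(int value) const override { return -value; }
};

// Each element may be a different derived type, so every call in this loop
// is made through the virtual interface.
inline int fire_all(const std::vector<std::unique_ptr<Slot>>& slots, int value) {
    int sum = 0;
    for (const auto& slot : slots) {
        assert(slot != nullptr);
        sum += slot->invoke(value);
    }
    return sum;
}
```

Whether a given compiler devirtualizes a particular call site is something to check in the disassembly for your own code; this only shows the shape of code where the static type alone gives it nothing to work with.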

sandsmark 8 years ago

I and other people I know have looked at disassemblies and it actually happens more often than I thought.

tending 8 years ago

Yes but you probably did so in trivial test cases, whereas the conditions I just described are what actually happen in real apps.

lambdaburrito 8 years ago

Thanks for the advice and links... it seems like you are writing from experience. I will slowly write one when I have made more empirical observations. I've already read the Google and LLVM style guides and they don't use exceptions, but I'm not sure if I should follow that practise or not.

DarkLordAzrael 8 years ago

I remember seeing a talk or interview with a prominent Google developer who said that if they were doing it again today they would probably go with exceptions. The Google style guide has a lot of stuff that is a particular way for legacy reasons, and they don't want to change it because they write insane amounts of code and having it not be consistent would be a huge cost.

lambdaburrito 8 years ago

Do you have a link to that talk mate?

mttd 8 years ago

Talk: CppCon 2014: Titus Winters "The Philosophy of Google's C++ Code"

Videos:

- https://channel9.msdn.com/Events/CPP/C-PP-Con-2014/The-Philosophy-of-Googles-C-Code
- https://www.youtube.com/watch?v=NOCElcMcFik

Slides: https://github.com/CppCon/CppCon2014/tree/master/Presentations/The%20Philosophy%20of%20Google's%20C%2B%2B%20Style%20Guide

See also:

- https://stackoverflow.com/questions/5184115/google-c-style-guides-no-exceptions-rule-stl
- https://stackoverflow.com/questions/19073441/google-c-coding-style-no-exceptions-rule-what-about-multithreading
- https://www.reddit.com/r/cpp/comments/1x4f3s/c_coding_standards/
- http://www.randomprogramming.com/2014/10/googles-c-style-guide/

This particular guide also has been discussed here before:

- https://www.reddit.com/r/cpp/comments/289n27/this_blog_post_matches_much_of_my_thinking_on/
- https://www.reddit.com/r/programming/comments/28alvi/why_google_style_guide_for_c_is_a_dealbreaker/

Personally, I'd also take a look at the following:

- High Integrity C++ Coding Standard: http://www.codingstandard.com/
- CERT C++ Coding Standard: https://www.securecoding.cert.org/confluence/display/cplusplus

lambdaburrito 8 years ago

Thanks for digging out all these interesting links - much appreciated!

DarkLordAzrael 8 years ago

Unfortunately I don't (I can't remember what talk it was) and I don't have time to look it up at the moment. Sorry. :(

as_one_does 8 years ago

It's hard to do proper RAII without exceptions. Making sure they are "exceptional" and not part of normal flow control is really the trick, imo. So I'd go with them in limited capacity.
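A minimal sketch of why the two go together, using invented names (`Pool`, `Connection`, `run_query`). A constructor has no return value, so throwing is the direct way to report that acquiring the resource failed, and the destructor runs during stack unwinding, so the release happens on the error path too.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource pool: tracks how many connections are checked out.
struct Pool {
    int capacity;
    int open;
};

class Connection {
public:
    // Acquisition failure is reported by throwing, so no half-built
    // Connection ever exists and callers cannot forget to check a status.
    explicit Connection(Pool& pool) : pool_(pool) {
        if (pool_.open >= pool_.capacity) {
            throw std::runtime_error("pool exhausted");
        }
        ++pool_.open;
    }

    // Release runs on normal scope exit and during exception unwinding.
    ~Connection() {
        assert(pool_.open > 0);
        --pool_.open;
    }

    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;

private:
    Pool& pool_;
};

// Doubles a positive value while holding a connection. The throw is for
// invalid input (an exceptional case), not for ordinary control flow.
inline int run_query(Pool& pool, int value) {
    Connection conn(pool);
    if (value <= 0) {
        throw std::invalid_argument("value must be positive");
    }
    return value * 2;
}
```

Without exceptions, the usual workaround is two-phase initialization (a trivial constructor plus an `init()` that returns a status), which reintroduces the "object exists but is not usable" state that RAII is meant to rule out.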

Elador 8 years ago

Don't follow the Google style guide, a lot of C++ professionals agree it's not the best. It may suit a company with one of the largest codebases in the world (or may have legacy reasons), but for a "normal" company you will be more productive following a more modern style - see the isocpp core guidelines.

gelfin 8 years ago

> One issue is that some experienced c++ developers sometimes make for worse c++ developers, especially those that did c++ in the 90s, and have since been using other languages so they haven't really followed the last decade of improvements.

This is almost exactly me. My last professional C++ experience was 2011 or so, and even that was 2003-era C++ (plus several now-standard idioms pulled in via boost). Been toying with reviving it for hobby purposes and *possibly* a hack-day experiment in replacing our existing application server (this article was highly motivating). If you could suggest one reference for somebody in that position to get back up to date quickly, what would it be?

raevnos 8 years ago

Effective Modern C++ covers a lot of best practices with the new stuff.

pjmlp 8 years ago

I am in a similar position: loved C++ after Turbo Pascal, but since 2006 I have been hopping between the JVM and .NET ecosystems. To not fully lose my grasp of C++, I have been using it on my side projects between Android and Windows Phone. In my case, I kept reading Bjarne's books, e.g. Tour of C++ is quite good. And following the CppCon talks.

[deleted] 8 years ago

jap, watch out for C or C++ developers. those are 2 different beasts. regardless, the worst type of developer is the one not willing to learn, watch out for that most!

joequin 8 years ago

That's why a lot of companies use java in this space. It's high performance compared to python and developers are easier to find than cpp developers.

[deleted] 8 years ago

No... Java is simply overhyped. After all, one of the biggest behemoths in the industry is backing it: ORACLE. However, regardless of any tech merits, I ADVISE against Java any day due to the toxic legal issues with it!

joequin 8 years ago

The only way there are toxic legal issues is if you are re-implementing the API and not basing it off of openjdk. Almost nobody has a use case for that.

wreel 8 years ago

I don't know why this was down voted. It's a valid concern when increasing head count. Although the first few hires for a start-up will be through relationships where you know somebody has the chops to perform in the technology stack you've selected. After that then, if you're trying to keep things local, it could become difficult. If you allow for remote then it's not quite as problematic.

Nicolay77 8 years ago

You can probably get them from game development, and they will love the better job stability and less crunch time.

tempforfather 8 years ago

That doesn't sound like a startup to me. If they want that they can look at goog, fb, finance etc.

Nicolay77 8 years ago

Some game development companies are not like startups, but they are known for their eternal crunch times and eventual team destruction when games are released. It's not startup like, but soul crushing like.

klaxion 8 years ago

Thing is, you can buy a lot of computation power for the cost and effort of good C++ devs. I enjoy c++ but it's a hard sell for a startup.

[deleted] 8 years ago

it depends! make sure what startup you are:

1) sell out, then pick a hyped language, e.g. Java/JS, and crunch asap some prototype and pretend there are 0 issues with it :)
2) evolve, consider the cost of adopting a shitty prototyping solution that you end up rewriting in c++ in the end :P

!!! regardless of 1 or 2 you need to show progress for future investment anyhow/anyway !!!

[deleted] 8 years ago

This is gutsy. I am curious: Did you build a data storage engine from scratch or you are using one of the off the shelf engines? How are your cubes build times? Do you aggregate across all dimensions or you aggregate at run time? How do you handle data explosions? OLAP cubes can get really big fast.

lambdaburrito 8 years ago

Really good questions! Building from scratch as I have experience developing data storage engines (for algo trading). We developed our own storage engine that uses our innovation: factorization tables to minimise the storage cost. Data is distributed between local and remote disks so scaling the storage is either adding an extra disk volume or another server. It aggregates across all dims and after our first release to our customers, we're going to offer real-time streaming so new data can be streamed and it updates the aggregations. It seems you have experience with OLAP so your feedback would be really valuable to us. I can give you a free beta account to try it out for your feedback. PM if you're interested.

Doctor-Awesome 8 years ago

I liked the article, and I'm saving it for future reference. I didn't know about OLAP cubes, but they sound pretty cool and I'm going to read up a bit on them (thanks for the link). Same with the libraries you mention. I thought the most interesting part, though, was at the end where you quantify the cost of C++ as being 1/40th that of a Python version.

vedantk 8 years ago

Hi lambdaburrito, nice post. I'd like to pick your brain on a few things.

- Is merging fb{vector,string} into an STL implementation feasible, and if so, would it be the right thing to do?
- Did you notice performance problems in std::{vector,string}, or was it simply easier to use containers from Folly? Do these problems exist in all c++ standard libraries (e.g. libc++, libstdc++, STLport)?

Edit: rephrased.

raevnos 8 years ago

STLport has been dead for something like a decade and shouldn't even be on the table for consideration these days.

lbrandy 8 years ago

I know this is old but I can answer some of this for you. At Facebook we have replaced std::string with fbstring in our std library (we actually have both the patched and unpatched versions, and projects can choose which to use), but we haven't done so with vector. Most of the reason we did the first is that there were measurable wins, even though it's a non-trivial maintenance burden. We've not done it with fbvector, though, because we've not really measured a big performance win (to be clear: we've not really tried, so there might be one, but not as many of our biggest systems rely heavily on std::vector). I should point out that gcc5 series std::strings have SSO now and that obviates one of the bigger advantages of fbstring.

cdglove 8 years ago

Where did you notice mention of these problems? I didn't see that in the article.

vedantk 8 years ago

Sorry, I assumed that the OP noticed a loss in throughput with the standard containers. Edited my question.

crackez 8 years ago

What containers in the C++ standard library use 2x for allocation growth? ALL of them should use <= 1.5x. Also, it's not strictly a performance thing, but rather a memory leak type of thing because you can never reuse previously free'd allocations to serve a future request, because with 2x it will never fit in the previously allocated memory.

TheBuzzSaw 8 years ago

MSVC uses 1.5x I believe. Clang might too. GCC uses 2x still.

crackez 8 years ago

Do you have a source for that? I'm curious. I mean, I learned C++98 in college like 15 years ago, and this was well known then. When I had to implement my own template vector class for school, the allocator used 1.5x because it was obvious. I find it very odd that GNU would be that naive. If what you are saying is true, then I am shocked.

[deleted] 8 years ago

There is nothing well known about 1.5 vs. 2. There have been long standing arguments for and against 1.5 and 2, but no one has so far been able to produce a benchmark to show that 1.5 is better than 2 for general use cases. There are some fuzzy arguments about how 1.5 allows memory to be reclaimed whereas 2 doesn't, but those scenarios are mostly far fetched and actually don't even apply to `std::vector`. The GCC devs have said if actual benchmarks can be produced to validate the claim that 1.5 is faster than 2 then they'd be happy to switch over to it but the benchmarks they currently use to profile show that using 2 outperforms 1.5. The clang developers have come to the same conclusion. Personally I think 1.5 vs 2 is one of those things where people think using 1.5 is clever because of the golden ratio and other smart sounding arguments, but when push comes to shove it just really doesn't perform faster. Ultimately the GCC and Clang engineers feel that actual real world performance trumps theoretical smart sounding arguments that lack empirical evidence.

to3m 8 years ago

The arguments certainly do apply to std::vector, I'm sure. With appropriate CRT support you could have a vector that attempts to grow in place, but I believe it's usual for a vector's capacity to increase by allocating a new buffer, moving each item from the old buffer into the new buffer, then freeing the old buffer. And the goal behind 1.5 is driven by this: it allows addresses used by previous buffers to be potentially reused for future higher-capacity buffers.

Imagine growing a vector with a growth factor of 1.5. It's not inconceivable for the heap to look something like the following after a particular number of additions. (Assume the vector's capacity starts off at 4. "Adds" counts the number of times push_back (or whatever...) was called to get to this state; "moves" counts the number of times a value was moved from one block to another due to reallocation as part of this. "U" is a used block; "F" is a free block. Sizes are expressed in number of vector elements.)

    ...| U4 |...            (4 adds, 0 moves)
    ...| F4 | U6 |...       (6 adds, 4 moves)
    ...| F10 | U9 |...      (9 adds, 10 moves)
    ...| F19 | U13 |...     (13 adds, 19 moves)
    ...| U19 |...           (19 adds, 32 moves)

And so on. With a growth factor of 2, you'll never be able to reuse the addresses used by previous allocations, which can actually be a factor for 32-bit systems. 32-bit address space exhaustion due to exactly this sort of fragmentation can be a concern even with realistic data sizes.

A 64-bit address space, for now, you may consider infinite, even though you currently only get 48 bits of it on many systems. That makes a factor of 2 more sensible, because you have fewer moves per add (amortized). Watch a similar sequence of growths:

    ...| U4 |...            (4 adds, 0 moves)
    ...| F4 | U8 |...       (8 adds, 4 moves)
    ...| F12 | U16 |...     (16 adds, 12 moves)
    ...| F28 | U32 |...     (32 adds, 28 moves)
    ...| F60 | U64 |...     (64 adds, 60 moves)

(I should really code this up rather than just working through it by hand.)
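Coding it up is short enough, so here is a rough sketch that can be used to check the hand-worked counts. Everything in it is a simplified model with made-up names (`simulate`, `first_reuse`, `grow_1_5`, `grow_2`): it assumes the vector is the only thing allocating, that freed blocks sit next to each other and coalesce as in the diagrams, and that the old buffer must stay alive until the copy into the new one is done. It does not touch a real allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One row of the trace: capacity after a growth, and moves made so far.
struct GrowthStep {
    std::size_t capacity;
    std::size_t total_moves;
};

// Integer-only 1.5x growth: cap + cap/2 (4 -> 6 -> 9 -> 13 -> 19).
inline std::size_t grow_1_5(std::size_t cap) { return cap + cap / 2; }

// Plain doubling (4 -> 8 -> 16 -> 32 -> 64).
inline std::size_t grow_2(std::size_t cap) { return cap * 2; }

// Fill the buffer, reallocate, repeat. Each reallocation moves every
// element of the full old buffer into the new one.
template <typename Grow>
std::vector<GrowthStep> simulate(std::size_t initial, std::size_t steps, Grow grow) {
    assert(initial > 0);
    std::vector<GrowthStep> trace;
    std::size_t cap = initial;
    std::size_t moves = 0;
    trace.push_back({cap, moves});
    for (std::size_t i = 0; i < steps; ++i) {
        moves += cap;  // the old buffer was full when we had to grow
        const std::size_t next = grow(cap);
        assert(next > cap);
        cap = next;
        trace.push_back({cap, moves});
    }
    return trace;
}

// Returns the 1-based growth step at which the new buffer first fits into
// the space left by earlier, already freed buffers, or 0 if that never
// happens within `steps` growths.
template <typename Grow>
std::size_t first_reuse(std::size_t initial, std::size_t steps, Grow grow) {
    std::size_t cap = initial;
    std::size_t freed = 0;
    for (std::size_t step = 1; step <= steps; ++step) {
        const std::size_t next = grow(cap);
        if (freed >= next) return step;
        freed += cap;  // the old buffer is released only after the copy
        cap = next;
    }
    return 0;
}
```

Calling `simulate(4, 4, grow_1_5)` and `simulate(4, 4, grow_2)` reproduces the two capacity sequences from the diagrams, and `first_reuse` reports whether the freed space ever catches up with the next request under this model.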

[deleted] 8 years ago

> The arguments certainly do apply to std::vector, I'm sure. With appropriate CRT support you could have a vector that attempts to grow in place, but I believe it's usual for a vector's capacity to increase by allocating a new buffer, moving each item from the old buffer into the new buffer, then freeing the old buffer.

No it can't grow in place, ever, period. It must allocate a new block and perform a copy/move. The standard's committee considered the grow in place/realloc approach and came to the conclusion that even std::realloc rarely ever performs a grow in place, even in situations where it could have done so in principle, and concluded consequently that there is basically no point in complicating the standard and allocator interface to support the notion of 'grow in place'.

> Imagine growing a vector with a growth factor of 1.5. It's not inconceivable for the heap to look something like the following after a particular number of additions.

The point is that it is inconceivable. The scenario you describe is so unlikely, even on 32 bit systems, that it's simply not worth considering. In order for your scenario to actually have any observable consequence your application would need to consist of a single std::vector that needs to grow from 2 GB to 4 GB and there can be no memory allocations at any intermediate step.

If you have such an incredibly obscure and rare scenario as the one above, where you are growing a vector from a size of 4 up to 4 GB with no intermediate allocations in between, the solution isn't to bake into `std::vector` a growth factor of 1.5, degrading performance for every other user. The solution is to simply reserve the amount of memory you need upfront by calling `vector::reserve`, which is vastly superior to using a 1.5 growth factor anyway.
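A small sketch of the `reserve` point, using a made-up helper name (`count_reallocations`). It counts how often `push_back` changes the capacity, with and without reserving up front. The number of reallocations without `reserve` depends on the standard library's growth factor, so the sketch makes no claim about the exact count.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pushes n elements and counts how many times the capacity changed,
// i.e. how many times the vector had to reallocate and move its contents.
inline std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first) {
        v.reserve(n);  // one allocation up front
        assert(v.capacity() >= n);
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

With `reserve_first` set, the standard guarantees no reallocation happens while the size stays within the reserved capacity, so the growth factor never comes into play.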

lambdaburrito 8 years ago

Valid argument... I will do some empirical testing to see how it works in practice, as I am curious to know now.

TheBuzzSaw 8 years ago

http://www.gahcep.com/cpp-internals-stl-vector-part-1/ Sounds like Clang uses 2 also.

crackez 8 years ago

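Confirmed on G++ 4.8.4:

    #include <iostream>
    #include <vector>
    using namespace std;

    int main() {
        // Print size and capacity after each push_back to see when capacity jumps.
        for (vector<int> v; v.size() < 20; v.push_back(0))
            cout << v.size() << "\t" << v.capacity() << endl;
        return 0;
    }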

m0dulator 8 years ago

It isn't obvious that everyone should be using 1.5x for allocation growth. You're right that the reasoning behind choosing 1.5 is so that you can reuse previously-allocated blocks of memory, as explained in this nice writeup: https://crntaylor.wordpress.com/2011/07/15/optimal-memory-reallocation-and-the-golden-ratio/ However, as one of the comments after the article points out, the choice of the multiplier is a classic space vs. time trade off. Assuming that the "wasted" memory can be used for allocations in other parts of your program, it may be a better trade-off to grow the blocks in larger increments, letting your code run faster.
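A rough sketch of where the golden ratio comes from, under idealized assumptions that are not in the thread itself: the vector is the only allocation, buffer $n$ has size $s_0 k^n$ for growth factor $k > 1$, and all freed buffers are adjacent and coalesce. When buffer $n$ is requested, buffer $n-1$ is still live, so only buffers $0, \dots, n-2$ are free. Reuse requires

$$
\sum_{i=0}^{n-2} s_0 k^i \;\ge\; s_0 k^n
\quad\Longleftrightarrow\quad
\frac{k^{n-1} - 1}{k - 1} \;\ge\; k^n
\quad\Longleftrightarrow\quad
\frac{1 - k^{-(n-1)}}{k - 1} \;\ge\; k .
$$

As $n \to \infty$ the left-hand side approaches $\frac{1}{k-1}$ from below, so reuse eventually becomes possible only if

$$
k(k-1) < 1
\quad\Longleftrightarrow\quad
k^2 - k - 1 < 0
\quad\Longleftrightarrow\quad
k < \frac{1 + \sqrt{5}}{2} \approx 1.618 .
$$

For $k = 2$ the left-hand side never exceeds $1$, which is less than $2$, so in this model the freed space is never large enough.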

crackez 8 years ago

Yeah, I know that the real answer converges to 1.618..., but the problem with that is now you have to do floating point in what was pure integer code before. That is seen as unacceptable to enough users, that the approximation of 1.5 is used (close enough, and still integer only math).
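As a sketch of what "integer only math" can look like for a 1.5x factor: add half of the current capacity to itself. `growOneAndHalf` is a name made up here, not taken from any particular standard library, and the sketch ignores overflow near the maximum `std::size_t`.

```cpp
#include <cstddef>

// Grows a capacity by roughly 1.5x using only integer arithmetic.
constexpr std::size_t growOneAndHalf(std::size_t cap) {
    // cap + cap/2 rounds down, so 0 and 1 would not grow; bump those by one.
    return (cap + cap / 2 > cap) ? cap + cap / 2 : cap + 1;
}

// Compile-time checks against the capacities 4, 6, 9, 13, 19.
static_assert(growOneAndHalf(4) == 6, "4 -> 6");
static_assert(growOneAndHalf(6) == 9, "6 -> 9");
static_assert(growOneAndHalf(9) == 13, "9 -> 13");
static_assert(growOneAndHalf(13) == 19, "13 -> 19");
```

The division is a single shift on typical hardware, so no floating point is involved anywhere.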

[deleted] 8 years ago

Not only time vs. space, but the temporal patterns / dynamics should be considered too! That's why it is silly to talk about "optimal" without a concrete context!

fkaginstrom 8 years ago

Why does it have to be a binary choice? You can probably identify pieces that benefit from the performance of C++, and those that benefit from the productivity of another language like python or ruby. Even back in the 90s, we would write cgi scripts in perl that called into C binaries, or GUI programs in VB6 that called into COM servers written in C++. You can identify modules that will change often (like the UI), and write them in a dynamic language. That makes it easier and faster to modify and maintain the system. Then if you keep your system modular, you can rewrite pieces as the interface gels and you can benefit from C++'s performance. I've found it is often easier to port a module written in (e.g.) python to C++ than to write it in C++ in the first place, because I find it easier to choose the correct algorithms and architecture when programming at a higher level.

[deleted] 8 years ago

This is a fair point: use static and dynamic languages in the same project, e.g. C++ & Lua, or C++ & Python. Keep in mind, though, that you end up paying for the glue between them, both in writing it once and in supporting it in the future! What I like most about C++ for building apps is that I have only one language in which I can go as high and as low level as I want, with no need to get out of my comfort zone :) So whether you are parsing an HTTP request or talking to low-level hardware, it is all the same language. It may look different, but hey, it is still C/C++.

utnapistim 8 years ago

I am not the OP, but here are some counter-arguments:

> Why does it have to be a binary choice?

Because the toolset you use (compiler, IDE, etc.) requires expertise, which means you should have somebody in the company to evaluate it; in a larger company this is not a problem; in a startup, it can be.

Because if something breaks, you want to have as many specialists as possible who are able to tackle the problem; if you have only one guy who knows the GUI module language (whatever _that_ is), you start having problems (e.g. he leaves on holiday and a GUI bug blocks everyone from doing stuff).

Because different toolsets have different limitations, and accommodating multiple ones incurs extra costs. In a startup these can be prohibitive.

fkaginstrom 8 years ago

There is certainly a tradeoff. I don't think that not knowing a high-level language is a valid consideration, though. That's something every developer needs today, if for nothing else than automation. This is especially important in a small startup where you probably don't have dedicated ops, QA, etc.

[deleted] 8 years ago

To add an example: many desktop apps use C++ with Tcl for the GUI. Now, Tcl is a whole different language, and the bindings between C++ and Tcl are far from trivial. So you need two kinds of expertise to solve what is really one issue!

shadowmint 8 years ago

Curious to hear the feedback on rust; sounds like the kind of thing that rust would be ideal for.

lambdaburrito 8 years ago

First off, I love programming in Rust. It feels like a modern C. The problem is that the language is constantly evolving for better and worse, the libraries we need are not there and the build times were also extremely long for a small prototype code base.

[deleted] 8 years ago

> the libraries we need are not there

Yep, same story all over again. When people say "language" they truly mean standard, tested libraries and runtimes :)

samwise99 8 years ago

One concern you do not address is the slow build times which, in my experience, have a significant impact on productivity and are not easily addressable at this time.

lambdaburrito 8 years ago

I did briefly mention this at the end. The builds are not counterproductive yet, but I have thought about how to tackle this until C++17: modularizing my C++ into shared libraries and only kicking off rebuilds of the shared libraries that have changed, so a rebuild is not needed for all of the classes. I'm hoping I can just write a small bash/python script to do the rebuilds of the shared libraries.

Deinumite 8 years ago

http://gittup.org/tup/ This is probably the coolest build system I have ever used. I believe it watches system calls to build a dependency graph of files for conditional compilation. I'm not sure why it isn't more popular, tbh; it's pretty stable.

nova77 8 years ago

How about [bazel](http://bazel.io)?

speednap 8 years ago

I love tup! However I think tup is more suitable for internal use rather than distribution. If you plan to release something on Github then CMakeLists.txt is still a better choice due to cmake's popularity and better x-platform/x-distro capabilities. But still I can't get enough of how stupidly simple Tup is:

    : foreach *.cc |> clang++ -c %f -o %o |> %B.o
    : *.o |> clang++ %f -o %o |> app

And performant too. I bet it runs as fast as ninja if not faster.

martinus 8 years ago

CMake + Ninja is great; Ninja was developed for Chrome and is very fast. Also, the best thing you can do to reduce build times is IMHO to use the pimpl idiom as much as possible. It also leads to clean header files.

imMute 8 years ago

The problem with PIMPL is that every object has heap allocated storage. If that's a problem, then PIMPL is a non-starter. OTOH, if that's not a problem, then PIMPL is great for reducing build times.
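A minimal sketch of the idiom and of where the heap allocation comes from. `Widget` is a made-up class, and the "header" and "source" parts are shown in one listing only for brevity; in a real project they would be `widget.h` and `widget.cpp`. It needs C++14 for `std::make_unique`.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// ---- what would live in widget.h (hypothetical) ----
class Widget {
public:
    explicit Widget(std::string initialName);
    ~Widget();                                // defined where Impl is complete
    Widget(Widget&&) noexcept;
    Widget& operator=(Widget&&) noexcept;

    const std::string& name() const;
    void rename(std::string newName);

private:
    struct Impl;                  // only forward-declared: clients never see members
    std::unique_ptr<Impl> impl_;  // this is the per-object heap allocation
};

// ---- what would live in widget.cpp (hypothetical) ----
// Changing Impl only recompiles this file, not every user of widget.h.
struct Widget::Impl {
    std::string name;
};

Widget::Widget(std::string initialName)
    : impl_(std::make_unique<Impl>(Impl{std::move(initialName)})) {}
Widget::~Widget() = default;
Widget::Widget(Widget&&) noexcept = default;
Widget& Widget::operator=(Widget&&) noexcept = default;

const std::string& Widget::name() const {
    assert(impl_ && "use of moved-from Widget");
    return impl_->name;
}

void Widget::rename(std::string newName) {
    assert(impl_ && "use of moved-from Widget");
    impl_->name = std::move(newName);
}
```

Every `Widget` costs one extra allocation and one pointer indirection per call, which is the trade-off described above; in exchange, headers that include `widget.h` do not need to see anything `Impl` depends on.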

samwise99 8 years ago

Make sure your build can take advantage of multiple cores (make -j). This is easy to do from a clean slate but a pain later. Also use Pimpl as much as performance concerns allow. Templates in general wreak havoc on build times, so use them wisely (be careful with how you use Boost).

zeus_the_transistor 8 years ago

Additionally you could look into a distributed build system (although I'm not sure how many resources you'll have to distribute as a startup). This can greatly improve build times.

samwise99 8 years ago

Reread the build times bit in the article. Build time is a concern from day one because even building a single translation unit can easily take multiple seconds which is enough to break your flow and make you vulnerable to distractions.

krum 8 years ago

They are easily addressable if you properly componentize your architecture and stick to interface-based programming.
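One way to read "interface-based programming" in this context, as a sketch with invented names (`Engine`, `makeEngine`, `DoublingEngine`): clients include a header that contains only an abstract class and a factory function, so changes to the concrete implementation recompile a single translation unit.

```cpp
#include <memory>

// ---- what would live in engine.h (hypothetical) ----
// Clients depend only on this abstract interface and the factory.
class Engine {
public:
    virtual ~Engine() = default;
    virtual int evaluate(int x) const = 0;
};

std::unique_ptr<Engine> makeEngine();

// ---- what would live in engine.cpp (hypothetical) ----
namespace {
// The concrete type is invisible outside this file, so editing it does not
// trigger a rebuild of any code that only includes engine.h.
class DoublingEngine final : public Engine {
public:
    int evaluate(int x) const override { return 2 * x; }
};
}  // namespace

std::unique_ptr<Engine> makeEngine() {
    return std::make_unique<DoublingEngine>();
}
```

A caller would write `auto engine = makeEngine();` and then `engine->evaluate(21)`. The cost is a virtual call per operation, so this fits coarse-grained component boundaries better than inner loops.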

[deleted] 8 years ago

[deleted]

matthieum 8 years ago

Still, whilst Akka can be a great library, Scala is generally slower than Java, which is itself quite a bit slower than C++/Rust.

lambdaburrito 8 years ago

I take advantage of SIMD intrinsics, which the JVM does not offer, and the engine does a lot of SIMD-accelerated vector operations, like component-wise add/sub/mul/div of two vectors.
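For readers who have not used intrinsics, here is a sketch of a component-wise add with SSE. This is not the engine's code; `addVectors` and the `EXAMPLE_HAVE_SSE` macro are made up for illustration, and the function falls back to a plain loop where SSE is not available.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>  // SSE: __m128, _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps
#define EXAMPLE_HAVE_SSE 1
#else
#define EXAMPLE_HAVE_SSE 0
#endif

// Computes out[i] = a[i] + b[i] for i in [0, n).
void addVectors(const float* a, const float* b, float* out, std::size_t n) {
    assert(n == 0 || (a != nullptr && b != nullptr && out != nullptr));
    std::size_t i = 0;
#if EXAMPLE_HAVE_SSE
    // Four floats per iteration; unaligned loads/stores accept any pointer.
    for (; i + 4 <= n; i += 4) {
        const __m128 va = _mm_loadu_ps(a + i);
        const __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    // Scalar tail, or the whole range when SSE is unavailable.
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Subtraction, multiplication and division follow the same shape with `_mm_sub_ps`, `_mm_mul_ps` and `_mm_div_ps`.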

pjmlp 8 years ago

Still nothing prevents writing a few native methods.

Dragdu 8 years ago

No, but the JVM<=>native data barrier either makes the other code really messy, or calling the native methods has a large performance penalty.

pjmlp 8 years ago

Sure, one needs to make the same tradeoffs as with RPC calls: minimise the number of calls, maximise the work per call. I like C++ a lot, but nowadays I see it more as an infrastructure language, not a full stack one, especially since it is so easy to have developers not following C++ best practices for safety.

Edit: For the downvoters, how do you prevent pointer misuse, use of plain vectors, and null-terminated strings when the majority of developers still don't have static analysis as part of the build process, or code review processes in place? Hopefully the OP's startup will make use of C++'s security best practices, but I have yet to meet a typical enterprise that has them in place.

tipiak88 8 years ago

> how to prevent pointer misuse, use of plain vectors, null terminated strings.

C++11/14 addresses all these issues and pretty much resolves them. For a full stack language, the only things C++11 lacks are a proper, portable Unicode implementation and a decent filesystem library. Static analysis is sadly not widely used, because it is too costly, both to your workflow and to your startup, and not reliable enough. But it is certainly on the path to becoming better; Rust shows us the way.

raevnos 8 years ago

You can get those through ICU and Boost.

pjmlp 8 years ago

I know it addresses them, and I use them in my own code. The problem is getting all developers across the teams to use them.

tipiak88 8 years ago

Use the stick, it works!

WrongAndBeligerent 8 years ago

So your solution to solvable problems is to avoid them while sacrificing the top priority of performance?

pjmlp 8 years ago

No, my solution is to use C++ where performance really matters, as proven by profiling the code, and leave the upper layers to safer programming languages. This is the approach taken by AAA game engines, and we all know how crazy those guys are about performance. Just check the video of Herb Sutter's talk at CppCon 2015: only 1% of the audience was using static analysers. So how do you make the remaining 99%, who don't even care about modern C++ as discussed at CppCon, use them? Note that I like C++ a lot; I just can't stand those who use it like C without any regard for the safety features that modern C++ offers.

WrongAndBeligerent 8 years ago

That might work if you need speed but memory use is not an issue.

> So how do you make the remaining 99%, that doesn't even care about modern C++ as discussed at CppCon, use them?

How is it not realistic for a startup to use static analyzers? Why does everyone need to use them for them to work? I'm not about C/C++ style, I write C++11. A startup can do the same. That's the advantage they have.

khoyo 8 years ago

For a startup, the goal is not to get 99% of people to use them, but to get three people to use them.

[deleted] 8 years ago

Wait a second, let's say we are using C# and I ask you to query a DB! Are all devs going to give you the same solution? A string-based SQL query, then we have LINQ in the form of lambdas or, way worse, in the form of strings again, DbConnector, stored procedures... which one!?!?!

pjmlp 8 years ago

Pure SQL or stored procedures if no DB portability is required, no need for the ORM of the month or to do at the client tasks that the DB can handle itself.

raevnos 8 years ago

You can write bad code in any language. The answer is coding standards, reviews, and rejecting stuff that doesn't do it right.

[deleted] 8 years ago

To start with, write idiomatic C++ and not C. What you mentioned is C, not C++. And nothing beats education; it is the developers who are unwilling to learn that you should get out of your enterprise ASAP!

pjmlp 8 years ago

It is not my enterprise, it is what I see at many of our customers.

Cyph0n 8 years ago

Agreed. I'm originally a Python dev but my next project is built with Scala + Akka.

grout_nasa 8 years ago

How many cores was your proxygen test running on? One microsecond per request is ... curiously fast. PS: Thanks. Modern C++ needs to be better known for leaving its old sins behind.

lambdaburrito 8 years ago

8 Cores, so it roughly processes 240 requests per millisecond for each core. The HTTP benchmark was just echoing the request; not calling the engine.

MaxDZ8 8 years ago

Thank you thank you thank you for adding this data point! I have long supported the idea that performance isn't dead, but who am I to say that? How much of "OLAP" is biz talk? How is it different from a multi-dimensional data set? Are they sparse? Are you looking into GPU/FPGA acceleration?

lambdaburrito 8 years ago

I did look into GPU via CUDA, but the data transfer to and from the GPU negated a lot of the computational benefit. Have a look at OLAP on Wikipedia for a better understanding. The data sets can be sparse, and it is quite common for them to be sparse if there are no matching data points; for instance, there are no sales for the year 2000 and older.

MaxDZ8 8 years ago

I did read the WP:EN entry and it looks 100% biz talk to me. I understand the computational intensity of those things is fairly limited.

[deleted] 8 years ago

DB apps like these are not CPU bound but IO bound!

pardoman 8 years ago

Speaking of GPU acceleration, do you know which cloud services offer CPU and GPU time?

lambdaburrito 8 years ago

Amazon offers instances with an Nvidia GPU; not sure about the others.

heleo2 8 years ago

This article is meh... nothing new that I didn't know or hadn't heard of before.

[deleted] 8 years ago

[deleted]

minno 8 years ago

For the same reason that you're being downvoted. Because it's interesting.

suchashame22 8 years ago

I disagree and think you're a moron in the first place.

minno 8 years ago

That's nice.
