likes
comments
collection
share

OutOfMemoryError是如何产生的

作者站长头像
站长
· 阅读数 14

背景

其实这个问题也挺有趣的,OutOfMemoryError,算是我们常见的一个错误了,大大小小的APP,永远也逃离不了这个Error,那么,OutOfMemroyError是不是只有才分配内存的时候才会发生呢?是不是只有新建对象的时候才会发生呢?要弄清楚这个问题,我们就要了解一下这个Error产生的过程。

OutOfMemoryError

我们常常在堆栈中看到的OOM日志,大多数是在java层,其实,真正被设置OOM的,是在ThrowOutOfMemoryError这个native方法中

void Thread::ThrowOutOfMemoryError(const char* msg) {
  LOG(WARNING) << "Throwing OutOfMemoryError "
               << '"' << msg << '"'
               << " (VmSize " << GetProcessStatus("VmSize")
               << (tls32_.throwing_OutOfMemoryError ? ", recursive case)" : ")");
  ScopedTrace trace("OutOfMemoryError");
  jni调用设置ERROR
  if (!tls32_.throwing_OutOfMemoryError) {
    tls32_.throwing_OutOfMemoryError = true;
    ThrowNewException("Ljava/lang/OutOfMemoryError;", msg);
    tls32_.throwing_OutOfMemoryError = false;
  } else {
    Dump(LOG_STREAM(WARNING));  // The pre-allocated OOME has no stack, so help out and log one.
    SetException(Runtime::Current()->GetPreAllocatedOutOfMemoryErrorWhenThrowingOOME());
  }
}

下面,我们就来看看,常见的抛出OOM的几个路径

MakeSingleDexFile

在ART中,是支持合成单个Dex的,它在ClassPreDefine阶段,会尝试把符合条件的Class(比如非数据/私有类)进行单Dex生成,这里我们不深入细节流程,我们看下,如果此时把旧数据orig_location移动到新的final_data数组里面失败,就会触发OOM

static std::unique_ptr<const art::DexFile> MakeSingleDexFile(art::Thread* self,
                                                             const char* descriptor,
                                                             const std::string& orig_location,
                                                             jint final_len,
                                                             const unsigned char* final_dex_data)
      REQUIRES_SHARED(art::Locks::mutator_lock_) {
  // Make the mmap
  std::string error_msg;
  art::ArrayRef<const unsigned char> final_data(final_dex_data, final_len);
  art::MemMap map = Redefiner::MoveDataToMemMap(orig_location, final_data, &error_msg);
  if (!map.IsValid()) {
    LOG(WARNING) << "Unable to allocate mmap for redefined dex file! Error was: " << error_msg;
    self->ThrowOutOfMemoryError(StringPrintf(
        "Unable to allocate dex file for transformation of %s", descriptor).c_str());
    return nullptr;
  }
  

unsafe创建

我们java层也有一个很神奇的类,它也能够操作指针,同时也能直接创建类对象,并操控对象的内存指针数据吗,它就是Unsafe,gson里面就大量用到了unsafe去尝试创建对象的例子,比如需要创建的对象没有空参数构造函数,这里如果malloc分配内存失败,也会产生OOM

static jlong Unsafe_allocateMemory(JNIEnv* env, jobject, jlong bytes) {
  ScopedFastNativeObjectAccess soa(env);
  if (bytes == 0) {
    return 0;
  }
  // bytes is nonnegative and fits into size_t
  if (!ValidJniSizeArgument(bytes)) {
    DCHECK(soa.Self()->IsExceptionPending());
    return 0;
  }
  const size_t malloc_bytes = static_cast<size_t>(bytes);
  void* mem = malloc(malloc_bytes);
  if (mem == nullptr) {
    soa.Self()->ThrowOutOfMemoryError("native alloc");
    return 0;
  }
  return reinterpret_cast<uintptr_t>(mem);
}

Thread 创建

其实我们java层的Thread创建的时候,都会走到native的Thread创建,通过该方法CreateNativeThread,其实里面就调用了传统的pthread_create去创建一个native Thread,如果创建失败(比如虚拟内存不足/FD不足),就会走到代码块中,从而产生OOM

void Thread::CreateNativeThread(JNIEnv* env, jobject java_peer, size_t stack_size, bool is_daemon) {
   ....

    if (pthread_create_result == 0) {
      // pthread_create started the new thread. The child is now responsible for managing the
      // JNIEnvExt we created.
      // Note: we can't check for tmp_jni_env == nullptr, as that would require synchronization
      //       between the threads.
      child_jni_env_ext.release();  // NOLINT pthreads API.
      return;
    }
  }

  // Either JNIEnvExt::Create or pthread_create(3) failed, so clean up.
  {
    MutexLock mu(self, *Locks::runtime_shutdown_lock_);
    runtime->EndThreadBirth();
  }
  // Manually delete the global reference since Thread::Init will not have been run. Make sure
  // nothing can observe both opeer and jpeer set at the same time.
  child_thread->DeleteJPeer(env);
  delete child_thread;
  child_thread = nullptr;
  如果没有return,证明失败了,爆出OOM
  SetNativePeer(env, java_peer, nullptr);
  {
    std::string msg(child_jni_env_ext.get() == nullptr ?
        StringPrintf("Could not allocate JNI Env: %s", error_msg.c_str()) :
        StringPrintf("pthread_create (%s stack) failed: %s",
                                 PrettySize(stack_size).c_str(), strerror(pthread_create_result)));
    ScopedObjectAccess soa(env);
   
    soa.Self()->ThrowOutOfMemoryError(msg.c_str());
  }
}

堆内存分配

我们平时采用new 等方法的时候,其实进入到ART虚拟机中,其实是走到Heap::AllocObjectWithAllocator 这个方法里面,当内存分配不足的时候,就会发起一次强有力的gc后再尝试进行内存分配,这个方法就是AllocateInternalWithGc

mirror::Object* Heap::AllocateInternalWithGc(Thread* self,
                                             AllocatorType allocator,
                                             bool instrumented,
                                             size_t alloc_size,
                                             size_t* bytes_allocated,
                                             size_t* usable_size,
                                             size_t* bytes_tl_bulk_allocated,
                                             ObjPtr<mirror::Class>* klass) 

流程如下: OutOfMemoryError是如何产生的

void Heap::ThrowOutOfMemoryError(Thread* self, size_t byte_count, AllocatorType allocator_type) {
  // If we're in a stack overflow, do not create a new exception. It would require running the
  // constructor, which will of course still be in a stack overflow.
  if (self->IsHandlingStackOverflow()) {
    self->SetException(
        Runtime::Current()->GetPreAllocatedOutOfMemoryErrorWhenHandlingStackOverflow());
    return;
  }
  这里官方给了一个钩子
  Runtime::Current()->OutOfMemoryErrorHook();
  输出OOM的原因
  std::ostringstream oss;
  size_t total_bytes_free = GetFreeMemory();
  oss << "Failed to allocate a " << byte_count << " byte allocation with " << total_bytes_free
      << " free bytes and " << PrettySize(GetFreeMemoryUntilOOME()) << " until OOM,"
      << " target footprint " << target_footprint_.load(std::memory_order_relaxed)
      << ", growth limit "
      << growth_limit_;
  // If the allocation failed due to fragmentation, print out the largest continuous allocation.
  if (total_bytes_free >= byte_count) {
    space::AllocSpace* space = nullptr;
    if (allocator_type == kAllocatorTypeNonMoving) {
      space = non_moving_space_;
    } else if (allocator_type == kAllocatorTypeRosAlloc ||
               allocator_type == kAllocatorTypeDlMalloc) {
      space = main_space_;
    } else if (allocator_type == kAllocatorTypeBumpPointer ||
               allocator_type == kAllocatorTypeTLAB) {
      space = bump_pointer_space_;
    } else if (allocator_type == kAllocatorTypeRegion ||
               allocator_type == kAllocatorTypeRegionTLAB) {
      space = region_space_;
    }

    // There is no fragmentation info to log for large-object space.
    if (allocator_type != kAllocatorTypeLOS) {
      CHECK(space != nullptr) << "allocator_type:" << allocator_type
                              << " byte_count:" << byte_count
                              << " total_bytes_free:" << total_bytes_free;
      // LogFragmentationAllocFailure returns true if byte_count is greater than
      // the largest free contiguous chunk in the space. Return value false
      // means that we are throwing OOME because the amount of free heap after
      // GC is less than kMinFreeHeapAfterGcForAlloc in proportion of the heap-size.
      // Log an appropriate message in that case.
      if (!space->LogFragmentationAllocFailure(oss, byte_count)) {
        oss << "; giving up on allocation because <"
            << kMinFreeHeapAfterGcForAlloc * 100
            << "% of heap free after GC.";
      }
    }
  }
  self->ThrowOutOfMemoryError(oss.str().c_str());
}

这个就是我们常见的,也是主要OOM产生的流程

JNI层

这里还有很多,比如JNI层通过Env调用NewString等分配内存的时候,会进入条件检测,比如分配的String长度超过最大时产生Error,即使说内存空间依旧可以分配,但是超过了虚拟机能处理的最大限制,也会产生OOM

    if (UNLIKELY(utf16_length > static_cast<uint32_t>(std::numeric_limits<int32_t>::max()))) {
      // Converting the utf16_length to int32_t would overflow. Explicitly throw an OOME.
      std::string error =
          android::base::StringPrintf("NewStringUTF input has 2^31 or more characters: %zu",
                                      utf16_length);
      ScopedObjectAccess soa(env);
      soa.Self()->ThrowOutOfMemoryError(error.c_str());
      return nullptr;
    }
    

OOM 路径总结

通过本文,我们看到了OOM发生时,可能存在的几个主要路径,其他引起OOM的路径,也是在这几个基础路径之上产生的,希望大家以后可以带着源码学习,能够帮助我们了解ART更深层的秘密。